RESEARCH IN ELEMENTARY EDUCATION¹
CHARLES H. JUDD 
University of Chicago 


In the last few years the number of scientific studies in the field of 
arithmetic has fallen off as contrasted with earlier years. The number 
of scientific studies of reading, on the other hand, has steadily increased. 
Yet arithmetic and reading are both major requirements in the ele- 
mentary curriculum. Not only so but if either one is urgently in 
need of study it is arithmetic, for in this subject there are more failures 
after the second grade than in any other subject of elementary 
instruction. 

If we inquire why it is that arithmetic is going forward less rapidly 
than reading I think we shall uncover certain fundamental principles 
regarding research which it is highly important for us to consider. 
Arithmetic is comparatively static in its content and methods in spite 
of recent heroic efforts to modify it. The major change of any
significance made in arithmetic in recent years is a reduction of
its content. Even this reduction has been very slight. Not only is
arithmetic static as a body of instructional material but the scientific 
studies which attempt to deal with it are equally static in their methods. 
We have had tests and more tests since Rice made his first tests and 
reported the results in 1901. The tests have told us where arithmetic 
fails and where more effective teaching should attack the subject, 
but the methods of new attack and the methods of further study have 
until very recently been held back by the universal devotion to tests. 

With reading the situation is very different. A new procedure has 
appeared in the schools in the field of reading. This is the procedure 
of instruction in silent reading. No revolution of greater importance
has come in elementary education since the time of Horace Mann than
the introduction of the concept of silent reading. This new method
of teaching has been a subject of vigorous scientific study. Indeed it
may be said that the discovery of silent reading and its introduction
into schools are results of scientific investigation. The investigations
in this field have also been of a novel and objective type. They have
employed the methods of analysis. They have employed an elaborate
technique quite inaccessible to the amateur. Reading is a moving
subject. Arithmetic is static.

1 Address before College Teachers of Education, February 23, 1926.

The work which has been done in arithmetic in recent years is for 
the most part routine work. One investigator after another has done 
with very slight modification just what his predecessor did. There 
is no field in which industry and faithful imitation have been more 
conscientiously cultivated than in the scientific studies of arithmetic— 
unless it be perhaps in that much trodden territory of spelling. 

The lesson which these illustrations ought to teach is the lesson that 
industry will not serve as a substitute for real research. I do not 
want to be misinterpreted. There certainly is a place in education for 
routine. I believe in tests as means of finding out where a pupil 
stands in his work. I believe in tests as valuable devices for the 
inspection and direction of school work by administrative officers. I 
believe in industry. The point I make is that routine and industry 
must not be accepted as satisfying the demand for research. 

Let me take another phase of modern education as a means of 
illustrating my point. We are in the midst of a most important move- 
ment for the reconstruction of the whole school system. The 
elementary school is losing its seventh and eighth grades. This read- 
justment at the upper end of the elementary school is quite certain 
to be followed by a vigorous reconstruction of the first 6 grades. 
When one asks what are the forces which have brought about this 
revolution, one gets vague answers. One learns that social pressure 
has produced the change; one learns that the lengthened school year, 
better equipment of teachers and other like causes have operated. 
The interesting point is that our technical journals which discuss 
scientific problems are almost destitute of papers bearing on this 
upheaval. One finds by reading the technical journals that research 
follows in the trail of that which has been done. We have a few 
studies on the superior retention of the reconstructed school. We 
have some studies on the degree of success achieved in teaching algebra 
in what used to be the grades. One feels, however, that science is
not the leader in the movement of reorganization. Science is merely
the bookkeeper recording transactions which others have initiated and 
executed. 

What I have been saying is that a great deal of our so-called educa- 
tional research is so limited by existing conditions which are only 
vaguely understood and our scientific investigations are so routine 
in their methods that they serve very little to carry education forward. 

I attribute the situation to the fact that we have very little energy 
to devote to true research. I am impressed by the fact that our best 
people in education are for the most part taken away from research 
by urgent practical duties. I find my colleagues in our own and other 
institutions occupied in making people listen to accounts of their
discoveries. Most research people literally have to go out and compel 
teachers and superintendents to pay attention to the most recent find- 
ings of educational science. I had a letter not long ago from one of 
the leading superintendents of the country asking for a man for his 
research division. “I want a man,” he said, “who can go before the
teachers and show them how to use the tests and other modern 
devices.” 

In one field in particular I am impressed with the difficulty of 
getting research carried on. That is in the field of school administration.
In this field I find some dogmatic statements and I find a great
deal of very useful practical activity, but little true research. The 
young men of marked ability who ought to be studying the functions 
of school Boards are out in the field trying to deal with them. The 
men of intellectual caliber sufficient to enable them to invent tests 
by which to evaluate the work of principals and superintendents are 
applying ready-made tests on pupils. There is less research going on 
today in the field of school administration than there is in the field 
of industrial management and administration. The reason for this 
is that industry is convinced that it pays to organize itself as highly as 
the best brains procurable can organize it. 

What I have been saying about educational research is said not 
at all because I am pessimistic about its possibilities. I confess I 
am impatient with its slow progress and with the apparent lack of 
equipment in men and time for fundamental work. Having said this, 
it becomes my task to try to suggest ways and means and also kinds of 
research which I believe we should develop. 

My first comment by way of suggesting plans for expansion deals 
with the means necessary. I feel sure that we must draft larger
energy for the task than we now command. A few people scattered
here and there in teacher-training institutions, working in isolation 
with a margin of time taken from their duties as teachers and from the 
enticing business of preparing salable books, will not serve to collect 
the information necessary to guide American schools. We need to 
learn how to greatly reinforce our ranks and how to pool our energies 
as the men in natural science have done through their National 
Research Council. We need to overcome the chief blight which has 
operated to impede educational research, namely institutional jealousy 
and the desire in some quarters to turn research into institutional 
advertising. We need to make school people and Boards of Education 
understand that research is as vital to the life of the schools as it is to 
the development of industry. We need to quadruple the energy that 
is now available for research. 

Having done this, or indeed even before we accomplish it, it is our
duty through conference of the best minds in our field to decide what 
is most fundamental. There is a possibility of directing research. 
The ambitious individual who is thinking first, as he naturally does, 
of his own career is not likely to start on the investigation of funda- 
mentals. The attractive problems under present conditions are the 
superficial problems. One can get money for a survey, one can get 
glory by attaching his name to another test. These are the minor 
and relatively unproductive jobs from the point of view of science. 
They are perhaps worthy enough in their way, but they become 
exasperating, petty obstacles in the pathway of educational science. 

It is in a spirit of advocacy of a broad cooperative program 
that I offer suggestions regarding what I believe to be fundamental 
problems. 

The first large investigation which we have to make is an investi- 
gation into the typical characteristics of pupils of different ages. It 
is perfectly clear that the school has as its major problem the promotion 
of the development of pupils. As Meumann pointed out years ago, 
we have many a brave start in describing mental development, we have 
much scientific study of infants and then the problem becomes too 
complex for us. We have nothing regarding the total development of 
pupils corresponding to what the physicians call the “natural history 
of a disease”; that is, we do not know how the learning processes with
which we are dealing progress stage by stage from inception to consummation.
summation. Take, for example, arithmetic. McLellan and Dewey 
told us 30 years ago where arithmetic begins, but no one has told us
in any systematic fashion the next and the subsequent stages. I
think myself that McLellan and Dewey were wrong, but at least 
they attacked a fundamental question and their work ought to have 
been followed before this by a fundamental inquiry into the whole 
problem of the nature of number ideas and their development. 

There is a continuous natural history of number consciousness; 
there is a continuous natural history of reading and of drawing and 
singing and play. Until we know these continuous natural histories, 
we cannot organize the subjects. 

Lest my program should be misunderstood, I hasten to point out 
that I am not ignorant of the charts which have been made out in the 
different subjects showing the so-called grade standards in these sub- 
jects. I regard these grade standards as in very large measure prod- 
ucts of the present school program, in many cases highly unnatural 
products. I do not ask for more grade standards. I ask rather for 
fundamental inquiry into the nature of pupils’ efforts in the primary 
grades and in the fifth grade and such a comparison of the various 
kinds of activities as shall tell us something about the degree of docility 
and the degree of fixity of the pupil’s intellectual processes, something 
of the difference in general intellectual grasp and interest which dis- 
tinguish fourth grade pupils from sixth grade pupils. I conceive of 
this problem as one describing the general intellectual setting into which 
arithmetic or any other study comes in the fourth grade. The problem 
is not one of measuring arithmetic as it is in that grade but of under- 
standing through analysis the way in which it fits or does not fit into 
the pupil’s mode of thought. Is arithmetic too abstract for the 
fourth grade? Is it too systematic to be really understood and there- 
fore likely to be merely learned by heart? Is it natural to a pupil’s 
interests? What are the pupil’s interests at this stage? 

The second fundamental, as it seems to me, is a study of the social
institutions to which we introduce pupils in the schools. I believe 
there is a rich and comparatively unworked field in the history and 
analytical study of social institutions. What is number in the per- 
fected form which appears in the Arabic numerals? What is the 
essential advance of the Arabic numerals over the Roman system? 
How did the race come to make the advance which it did? Or turning 
to drawing, let us ask first why does society have the particular kind of 
esteem for drawing that it does? What is the relation of drawing to 
society and to the individual which makes it so much less important 
in the school program than arithmetic? When we are through analyzing
the social character of arithmetic and drawing and the other sub-
jects in the elementary curriculum, we shall be prepared to ask about 
the school as a social institution. What is its place in the life of our 
generation? Let it be noted that I am advocating investigations. 
There have appeared from time to time those who had theories about 
the value of arithmetic and reading and spelling as social instruments, 
and we have often listened to philosophies about the nature of the 
school. My suggestion is that the time has come to investigate these 
problems by empirical methods. 

You will observe that I am trying to get at the fundamentals, not 
merely to deal with current practices. I am not so much interested in 
stimulating inquiries into the present status of the school and its pres- 
ent practices as I am in stimulating a series of inquiries which shall go 
back of the school and acquaint us with the human nature and the 
social demands out of which the present school has grown. 

I am quite certain that after we have made such fundamental 
studies we shall introduce into the schools forms of instruction which 
are not now there. The grade standards which are now available are 
measures of what is. If we go forward in our science merely measur- 
ing what now is, shall we ever be leaders in introducing new materials? 
Already the more progressive practical school people have outrun our 
tests. They are clamoring for a new curriculum and too often the 
best we have to offer them is the average of present practice. The 
time has come when averages will not satisfy. We must study 
fundamentals. 

I am sometimes afraid that the failure of educational research to 
cope with its fundamental problems will result in more superficial 
studies. I see what seems to me to be a marked tendency toward 
superficiality in the studies now being made of the curriculum. I find 
in many quarters satisfaction in the fact that the curriculum is to be 
revised by cutting out all that practical members of the community 
do not actually use. When I ask what is meant by the terms “actually
use,” I get an answer which seems to me very far from fundamental.
Shall we teach only those combinations in arithmetic which we find 
people using in the stores and in their private lives, or shall we recog- 
nize that the use of number has transformed modern life? The very 
idea of precision is a new idea and every modern man and woman uses 
that idea, even when he is not adding 7 and 3. I insist that what we 
need is fundamental research into the history and nature of such social 
institutions as numbers, language, exchange, and industry.

The study of the way in which minds mature and the study of the
way in which society reacts on the individual must be accompanied by 
a study of educational pathology. I feel sure that we tolerate in 
schools a great many insanitary intellectual conditions. We do not 
recognize them any more than the medievals recognized the insanitary 
conditions of their cities. It requires science to find what is whole- 
some and what is not wholesome. Tradition will not tell us. I have 
in mind such a fact as the formal order of the school. A generation 
ago order was one of the chief concerns of the teacher. Our generation 
thinks of order of the old-fashioned type as pathological, unnatural, 
undesirable and unfavorable to intellectual progress. Where did the 
formal order of the last generation come from? Why did that sort of 
disease seat itself in the school? How was it removed? In like fashion,
I believe we can find clear evidences of pathology in our present
educational system in the fact that many people do not read after
they leave school. Why is this so? If the work of the school is
successful, there ought to be a wholesome attitude toward the school
achievements. That there is a falling off in the use of reading when 
pupils leave schools is a symptom of insanitary intellectual conditions. 

Again I pause to say that what I am advocating is not a study of 
the details of present practice. What I am asking for is a fundamental 
examination of practice in its entirety, including its consequences. 
Schools have measured children. Let them now measure themselves. 

One of the greatest difficulties which must be faced by the advocates 
of a program of the kind which I have been recommending is that 
we have no methods conveniently at hand for making some of these 
investigations. There are methods which the last quarter of a century 
has perfected for measuring present products. Some of these methods 
are so readily usable that anyone can employ them. Their very 
perfection leads to their overuse. The result of their existence is that 
one is less likely to think of strange problems for which there are no 
known methods of investigation. One is overwhelmed by the con- 
ventional methods because they are so easily accessible. There is 
another fact about current investigations; they tend to follow the lines 
where the least expenditure of energy will lead to the most spectacular 
results. It is easily possible in our departments of education to get 
certain easy statistical pieces of work done. On the other hand, 
students are loath to embark on the arduous tasks which involve
laboratory work of a careful minute type, employing an elaborate
technique.

If we propose the problem of finding out what are the mental
characteristics of fourth grade boys, the easy answer is that which 
comes from a tabulation of average grade scores for the fourth grade. 
Everybody knows that these average scores do not answer the question. 
There is a physical, mental and moral maturity in the fourth grade 
which is as little described by average scores as the chemical processes 
of combination are described by measuring the ashes and smoke 
produced. The fourth grade has general characteristics which mark it 
as the period of emergence from the primary grades. The fourth
grade is a period of voluntary assumption of a new attitude toward the
world of people and of things. If we do not have methods of discovering
and recording these facts and using them to determine school proce- 
dure, then the sooner we set ourselves to the task of evolving the 
needed methods the sooner our science will emerge from the embryonic 
stage in which it now finds itself. 

I have not enumerated all of the fields in which productive funda- 
mental investigation is needed. I may run rapidly over some of the 
rest. We have had studies of finance and they have all shown that 
somebody must make a careful study of the taxing system. We may 
wait until the economists have made fundamental studies of taxation 
or we may take our share in the examination of our own institution 
in its relation to public support. 

In matters of personnel management, management of teachers and 
management of pupils, we find that school people are either superficial 
or full of quotations of pronouncements made by students of industrial 
organization. Shall we wait for someone to deal with the funda- 
mentals of social supervision outside the school and then try to carry 
over their results or shall we study the fundamental principles of 
human relations in the institution which is supposed to mould these 
relations? 

In the matter of the preparation of textbooks and other mechanical 
aids to instruction we have been peculiarly inert. We leave it to 
interested commercial concerns to invent and perfect educational toys 
and to prepare and print our books. We are dominated socially and 
pedagogically by forces which privately scoff at most of our scientific 
endeavors. Why not make a scientific investigation of the relation 
of the school as a social institution to the industrial forces which 
influence its operation? 

Another line of inquiry has to do with the operations of the class- 
room. Some of the most influential investigations made in recent
years have had to do with the problems of classroom procedure, and
yet anyone who contrasts the facts which appear during observation 
of a good teacher and the recommendations made in even our best 
textbooks on methods knows that the scientific description of teaching 
is in its infancy. 

I close this sketch of fundamental studies which are waiting to be 
taken up by students of the elementary school with the comment which 
reiterates what I said at the beginning. Our science is on the threshold 
of great expansions in scope and influence. The present school is 
to be reconstructed. The process of reconstruction will be a slow 
process of trial and error if our science lags. It can be made into a 
process of rapid, economical evolution guided by fundamental prin- 
ciples if we can secure the energy to devote to research and can release 
this energy from absorption in mere recanvassing of current practices
by methods which have become conventional because they are readily
used by novices.





AN EXPERIMENTAL STUDY OF THE NATURE OF 
IMPROVEMENT RESULTING FROM PRACTICE 
IN A MOTOR FUNCTION 


ARTHUR I. GATES 


Teachers College, Columbia University 


AND 
GRACE A. TAYLOR 
University of Pittsburgh 


A problem of great practical and theoretical importance both in 
education and psychology is that of the nature of the improvement 
brought about by practice. Stated more broadly, this is the problem 
of the nature of the changes in human nature which may be brought 
about by education. Concerning the character and limits of improve- 
ment which continued practice may produce there have been, tradi- 
tionally, two main theories. 

1. According to one theory, education or training in a function 
produces improvement by means of the development of techniques, 
mental and motor, without any change in what are usually termed 
fundamental capacities. Thus in learning to play the piano or to 
dance, it is held that improvement results not from improved motor 
capacity, not from increased speed, steadiness, facility, etc. of the
motor responses in a general way, not from a general improvement 
of the motor and neural machinery involved in the tasks, but rather 
from the acquisition of new methods of managing the mechanisms, 
from new patterns of reacting, new knowledge of procedure, new 
methods of work, new techniques, new “tricks of the trade.” Simi- 
larly, in learning to read, it is held that improvement consists not in 
improved memory, perception, intelligence, not in generally increased 
mental or motor capacity but in new information, new methods of 
work, new habits of functioning. The many investigators and 
theorists who, at the present time, hold the conception of “intelligence”
as a native capacity or group of capacities which cannot be appreciably 
changed by education and experience however rich, must subscribe, 
substantially, to this type of theory. 

2. A second type of theory affirms what is denied in the first,
namely, that practice may produce an improvement in the fundamental 
capacities exercised. The older theory, known as the Faculty Theory, 
which assumed that memory, perception, retentiveness, motor adaptability,
etc. were powers or capacities which could be improved as a
whole by training in any particular form, is now rarely upheld because 
of the antagonistic results of numerous experiments on the transfer of 
training. Modified in application and form, the same theory may be 
and is still maintained. It is still possible, for illustration, to argue 
that the training of a function brings about an improvement in such 
neurones or other mechanisms as are exercised and that these, funda- 
mentally improved, may operate in other functions. The observed 
transfer of training, in other words, conceivably may be due to the 
increased capacity of identical mechanisms involved in the trained 
and untrained function as well as, or rather than, to the transfer of 
methods of attack, information, techniques or habits acquired. Some 
of the writers who oppose the notion of native and unimprovable 
intelligence and other traits, hold views that at least imply a belief in 
the improvability of the fundamental bases of learning. 

3. To these two more familiar views, may be added a third theory. 
The capacities such as general intellectual capacity, the “capacity to
learn,” retentiveness, motor speed and control, etc. which have been 
investigated are alleged to develop gradually from birth to a maturity 
or maximum reached at various ages rarely later than the twentieth 
year. Since they are subject to growth, it is conceivable that continued 
exercise before the time of maturity might stimulate a more rapid growth of
the capacities concerned in a function. Continued exercise by the 
production of nutritive after-effects or by stimulating the production, 
by fuller release or by controlling the distribution of “hormones” or in
other ways now unknown, might increase the rate of development and 
raise the mature level of the factors which underlie capacity, particular 
or general. This theory, applied to education would, if substantiated, 
give new significance to the period of plasticity and increase the impor- 
tance of the early training of youth. 

Since these three theories—which represent but general types 
within which many particular hypotheses may be constructed—have 
not as yet been thoroughly tested, plans for several studies designed 
to throw some light on the relevant facts were made. In this paper,
the results of one study are offered. This study was arranged to test, 
in some measure, the nature of improvement in a motor function. 


GENERAL PLAN OF THE EXPERIMENT 


In the first experiment an effort was made to eliminate the influence 
of technique and specific informational content in order to provide the
possibility of measuring the combined effects, if any, of capacity
improved directly, or indirectly by means of accelerated growth. 
To achieve this end, two requirements must be met: (1) The function 
trained must be one in which a small amount of practice would yield 
the fullest mastery of technique—special skill and knowledge—which 
could affect achievement; and (2) the subjects used must be those in 
whom growth is, in all probability, very rapid. It was our judgment 
that speed of tapping would satisfy fairly well the requirements of the 
function and that children, as young as could perform reliably in a 
continued series of tests, should be selected as subjects. 

After preliminary studies of young subjects and different methods
of conducting the practice, children in the kindergarten of the Horace
Mann School from four to six years of age were selected as subjects
and 3 daily tests in tapping of 30 seconds each were adopted as the 
practice schedule. The tests were given to the children in groups of 
from 5 to 8. The taps were made with a large blunt pencil on a sheet 
of coarse paper. This device, which proved to be very satisfactory, 
provided permanent records that could be analyzed at leisure. The 
3 daily tests were given consecutively with short rest periods between 
them. 

Practice was begun about the middle of November with 82 children 
whose progress was carefully followed. Improvement, presumably 
due mainly to the acquisition of technique, was rapid and at the end 
of 18 days of practice, it appeared that a limit had been approximated.
At this time, the pupils were divided into 2 groups equivalent in 
each of the following respects: 

1. Speed of tapping, measured by averaging the best of the 3
daily records for each of the 18 days of practice.

2. Sex, i.e., equal number of boys and girls in each group.

3. Chronological age.

4. Mental age as determined by the Stanford-Binet scale.

5. Intelligence quotient.

6. General motor maturity as determined by a composite of
teachers’ judgments and a number of motor tests to be described
presently.
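
The source does not say how pupils were actually assigned to the two groups, so the following is only a minimal sketch of one way to obtain groups with similar mean tapping speed. It assumes a hypothetical "snake" assignment (A, B, B, A, ...) over pupils ranked by speed, uses invented pupil names and scores, and ignores the other five criteria listed above.

```python
def snake_split(pupils):
    """Split (name, tapping_speed) pairs into two groups with similar means.

    Illustrative assumption only: pupils are ranked by speed and dealt out
    in A, B, B, A order. This is not the authors' documented procedure.
    """
    ranked = sorted(pupils, key=lambda p: p[1], reverse=True)
    group_a, group_b = [], []
    for i, pupil in enumerate(ranked):
        # Ranks 0, 3, 4, 7, ... go to group A; ranks 1, 2, 5, 6, ... to group B.
        if i % 4 in (0, 3):
            group_a.append(pupil)
        else:
            group_b.append(pupil)
    return group_a, group_b


def group_mean(group):
    """Mean tapping speed of a group of (name, tapping_speed) pairs."""
    return sum(speed for _, speed in group) / len(group)


# Hypothetical pupils and scores, invented for illustration.
pupils = [("p1", 44.0), ("p2", 42.0), ("p3", 41.0), ("p4", 39.0),
          ("p5", 38.0), ("p6", 36.0), ("p7", 35.0), ("p8", 33.0)]
a, b = snake_split(pupils)
print(group_mean(a), group_mean(b))
```

A real matching on six criteria at once would need a check of each criterion after the split, much as Table I does for the study's own groups.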

The tests of motor ability were given before the tapping practice 
was begun. They were to be used not only to secure groups equivalent 
in these abilities but also, by repeating the tests at the end of practice,
to provide a means of measuring the transfer of improvement from
the tapping practice to other motor abilities. Briefly described,
the tests were as follows: 

1. A page of oblique lines / / / to be completed into X’s. Score 
was the product of number of lines and a rating from 1 to 5 for quality. 

2. Marking X’s as rapidly as possible. Score was product of rate 
and quality. 

3. Marking =’s as rapidly as possible. Score as in (1). 

4. Target Test; a page of small circles, the centers of which were to 
be struck with a pencil. Score as in (1). 

5. Copying Diagram Test. A series of geometrical figures on the 
left side of a page which were to be copied as rapidly and as well as 
possible. Ten seconds were allowed for each figure or 50 seconds for 
the whole test. Score as in (1). 

6. Sorting Pegs Test. A group of pegs of various colors which were 
to be sorted as rapidly as possible into piles of the same color. Score 
is number of seconds required to complete the task. 

7. Speed and Precision of Movement, Test A. A page of converg- 
ing lines between which the subject was to draw lines without touching 
the sides. 

8. Speed and Precision of Movement, Test B. A page of maze
pathways bounded by lines between which subjects were to trace 
as in (7). 


The extent to which the two groups were equated in these functions 
is shown in Table I.

TABLE I

                        | Tapping | Age  | Mental age | IQ  | Oblique lines test | Marking X's
Group I (practice)      | 38.6    | 4.63 | 5.79       | 125 | 10.8               | 35.9
Group II (non-practice) | 38.8    | 4.60 | 5.83       | 127 | 9.9                | 32.6

                        | Marking ='s | Target | Diagram | Pegs, seconds | Speed and precision A | Speed and precision B
Group I (practice)      | 31.2        | 10.1   | 34.5    | 118           | 4.8                   | 9.5
Group II (non-practice) | 33.1        | 10.8   | 29.9    | 130           | 5.3                   | 8.7

Differences between the average scores of some of the motor tests
appear, but they are not consistently in favor of either group. In 
tapping ability, age, mental age, IQ, and sex, the groups are sub- 
stantially equivalent. 

The groups on which the data in Table I are based consist of 28 
pupils, each of whom completed the entire experiment. Both groups, 
as stated above, in the first place practiced tapping for 18 school days.
After the eighteenth day practice for one group was halted; for the 
other it was continued throughout the remainder of the school year 
until May 17, a period of six months. The children of the other 
group—the control or unpracticed group—were again given the three 
daily tests for a period of 17 days at the end. The actual numbers of 
practice days were as follows: 








                         Preliminary   Main      Final
                         Period        Period    Period    Total
Practice Group.........      18          76        17       111
Control Group..........      18         none       17        35

















RESULTS OF SPECIFIC PRACTICE


Since the results of the daily tests would be cumbersome to present,
they have been summarized by utilizing only the best score of the three
for each day and presenting only the average of these best scores
for periods of five days, except that, in the Preliminary Period in two
instances and in the Final Period in three instances, the figures are
based on the records of four days. Table II gives the data which are
portrayed graphically in Fig. 1.

TABLE II. SHOWING THE AVERAGE SCORES IN TAPPING FOR FIVE DAY PERIODS
EXCEPT THOSE MARKED * WHICH ARE FOUR DAY PERIODS

Days                 1-4*   5-9    10-13*  14-18  19-23  24-28  29-33  34-38  39-43  44-48
Practice Group.....  35.5   38.4   40.6    40.1   39.8   39.9   40.1   40.7   40.6   41.2
Control Group......  35.8   37.5   40.1    40.9

Days                 49-53  54-58  59-63  64-68  69-73  74-78  79-83  84-88  89-93
Practice Group.....  41.8   41.0   42.1   42.8   43.2   41.9   43.7   44.3   44.6
Control Group......

Days                 94-98  99-102*  103-106*  107-111*
Practice Group.....  44.7   44.2     45.0      44.8
Control Group......  42.3   42.2     45.2      44.3











Table II and Fig. 1 show a rapid rise in ability to tap during the 
first 10-13 days for both groups. For the trained group, improvement 
thereafter is slow but continuous until the end. The Control Group, 
after a period of nearly six months without practice, shows during 
the first week an improvement over its last practice period but a score 
inferior to that of the Practice Group. After about 11 days of practice,
however, the untrained group overtakes the other and thereafter the 
2 groups are equal in ability. 
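The summarizing rule described for Table II (the best of the three daily scores, averaged over a block of days) and the comparison of the two groups in the Final Period can be sketched as follows. This is only an illustration: the raw daily scores in `example_period` are hypothetical, since the daily records are not printed, while the final-period averages are those given in Table II.

```python
from statistics import mean


def period_average(daily_trials):
    """Mean of each day's best score over one block of days, to one decimal."""
    return round(mean(max(trials) for trials in daily_trials), 1)


# Hypothetical raw scores (not from the study): three trials on each of four days.
example_period = [(40, 42, 41), (43, 41, 42), (42, 44, 43), (44, 43, 45)]
example_average = period_average(example_period)

# Final-period averages as printed in Table II.
final_periods = ["94-98", "99-102", "103-106", "107-111"]
practice = [44.7, 44.2, 45.0, 44.8]
control = [42.3, 42.2, 45.2, 44.3]

# Positive values favor the Practice Group.
differences = [round(p - c, 1) for p, c in zip(practice, control)]

for label, diff in zip(final_periods, differences):
    print(f"days {label}: practice minus control = {diff:+.1f}")
```

On the printed averages the Practice Group leads by about two taps in the first two blocks of the Final Period, after which the difference falls to half a tap or less in either direction.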

It appears to us that these facts should be interpreted as follows: 
The improvement brought about by practice is due to acquired 
methods of work, mental and muscular adaptations to the task and 
to the conditions of the tests, to techniques of various sorts. In the 
case of this function, these techniques are quickly acquired. Further 
improvement, as shown in the results for the practice group, is slow
and due to growth of whatever abilities or capacities are involved in 
this function. Such capacities are not improved by such further 
training as was given over a period of nearly six months either directly 
or indirectly by a stimulation of growth. In other words, of the 
three theories suggested at the beginning of the paper, only the 
first seems to harmonize with the experimental findings obtained 
in this study. 


THE TRANSFER OF TRAINING TO OTHER MOTOR FUNCTIONS


Since the Practice Group did not excel the Control Group at the 
end of the experiment, it would scarcely be expected to show a superi- 
ority as the result of transfer in the various motor functions tested 
before and after training. Inasmuch as the results of the transfer 
tests are at hand they should be surveyed, however, since the unex- 
pected sometimes happens for reasons that appear only after the 
happening has been studied. The results are shown in Table III. 


TABLE III. AVERAGE SCORES OF THE TRAINED AND UNTRAINED GROUPS IN
VARIOUS MOTOR TESTS

(I = initial score, F = final score, G = gain)

                               Oblique  Mark-  Mark-  Tar-   Copy    Sort      Speed and  Speed and
                               Line     ing    ing    get    Dia-    Pegs,     Pre-       Pre-
                               Test     X’s    =’s    Test   grams   seconds   cision A   cision B
Practice Group............ I    10.8    35.9   31.2   10.1   34.5    118        4.8         9.5
                           F    20.7    50.5   44.6   13.9   38.7     78        5.0        13.6
                           G     9.9    14.6   13.4    3.8    4.2     40        0.2         4.1
Control Group............. I     9.9    32.6   33.1   10.8   29.9    130        5.3         8.7
                           F    15.9    45.2   48.3   14.7   34.2     86        5.7        13.0
                           G     6.0    12.6   15.2    3.9    4.3     44        0.4         4.3
Difference in gains in favor
of Practice Group............    3.9     2.0   -1.8   -0.1   -0.1     -4       -0.2        -0.2
[label illegible]............    0.4     0.2    0.3    0.3    0.4    2.5        0.4         0.4





























The results of Table III are so inconsistent and the difference in 
gains so small as to produce no incontestable evidence of a transfer 
from the long period of practice in tapping. The tapping group 
appears to excel in crossing oblique lines and marking X’s but the con- 
trol group is superior in marking = signs. In the other tests, the 
differences are negligible. 
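The gains and the differences in gains reported in Table III can be recomputed from the initial and final averages. The sketch below assumes that a gain is the final score minus the initial score, except in sorting pegs, where the score is a time in seconds and a gain is a reduction. The variable names are illustrative only.

```python
TESTS = [
    "oblique lines", "marking X's", "marking ='s", "target test",
    "copying diagrams", "sorting pegs (sec.)", "precision A", "precision B",
]
# Sorting pegs is scored in seconds, so a lower score is an improvement.
LOWER_IS_BETTER = {"sorting pegs (sec.)"}

# Initial and final averages as printed in Table III.
practice_initial = [10.8, 35.9, 31.2, 10.1, 34.5, 118, 4.8, 9.5]
practice_final = [20.7, 50.5, 44.6, 13.9, 38.7, 78, 5.0, 13.6]
control_initial = [9.9, 32.6, 33.1, 10.8, 29.9, 130, 5.3, 8.7]
control_final = [15.9, 45.2, 48.3, 14.7, 34.2, 86, 5.7, 13.0]


def gains(initial, final):
    """Gain per test, counted so that a positive value is an improvement."""
    result = []
    for name, first, last in zip(TESTS, initial, final):
        raw = first - last if name in LOWER_IS_BETTER else last - first
        result.append(round(raw, 1))
    return result


practice_gain = gains(practice_initial, practice_final)
control_gain = gains(control_initial, control_final)

# Positive values favor the Practice Group.
difference_in_gain = [round(p - c, 1) for p, c in zip(practice_gain, control_gain)]

for name, diff in zip(TESTS, difference_in_gain):
    print(f"{name:22s} {diff:+.1f}")
```

The recomputed differences agree with the printed row: two favor the Practice Group and six favor the Control Group, most of them by a few tenths of a unit.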


POSSIBLE LIMITATIONS OF THE STUDY


Although there appears neither in the results of the function 
specifically trained nor in those from the transfer tests evidence indicat- 
ing that practice increases capacities, narrow or broad, either directly 
or indirectly by the stimulation of growth, there remain several possi- 
bilities that merit study. 

1. There is a possibility that the “techniques” practiced during
the long period may be more permanently retained than those of the
Control Group, which, although equally influential on the score at the
time (as during the last seven days of the experiment), were given
less exercise.

2. There is a possibility that the Practice Group, owing conceivably to
staleness or loss of zeal, could have done better than they did near the
end of the long period and that after a substantial interval of rest, they
might excel the other group.

3. There is a possibility that the continued practice did stimulate 
growth but that such growth is slow and might continue, or occur 
mainly, during a period following the close of the practice. 


RESULTS OF TEST REPEATED AFTER SIX MONTHS OF DISUSE


These possibilities gave sufficient reason for studying again as 
many of the same children as could be secured in November, 1924, just 
a year after the first tests were given and approximately six months— 
a period which included the summer vacation—after the completion 
of the tests in May. 

Of the original group of 28 pairs of subjects, 17 equivalent pairs 
were available. These 2 groups of 17 pupils each were equivalent in 
the tests given the year before but they were not equivalent to the 
original whole groups with whom they should not be compared. 

Both groups were given six days of practice in tapping as before
and thereafter 6 of the original transfer tests. All scores were
computed as before. Table IV gives the results. In the tapping tests
the following are compared: (1) the average of the 3 best scores obtained
during the first 18 days in November, 1923; (2) the 3 best scores during
17 days in May, 1924; and (3) the 3 best scores during six days in
November, 1924.


TABLE IV. SCORES IN THE SEVERAL TESTS FOR THE PRACTICE GROUP (PG) AND
CONTROL GROUP (CG)

                                                       Marking
                                       Tapping      oblique lines    Marking X’s    Marking =’s
                                      PG     CG      PG     CG       PG     CG      PG     CG
Average of 3 best scores, Nov., 1923  40.7   40.3    10.5   11.1     35.9   38.0    31.4   35.8
Average of 3 best scores, May, 1924   49.0   48.8    17.5   17.7     50.2   55.3    43.7   45.7
Average of 3 best scores, Nov., 1924  56.1   57.3    20.7   22.5     89.1   89.5    80.4   76.7
Gain May over Nov., 1923               8.3    8.5     7.0    6.6     14.3   17.3    12.3    9.9
Gain Nov., 1924 over May, 1924         7.1    8.5     3.2    4.8     39.9   34.2    36.7   31.0
Gain Nov., 1924 over Nov., 1923       15.4   17.0    10.2   11.4     52.2   51.5    49.0   40.9

TABLE IV. Continued

                                                       Drawing       Sorting Pegs
                                      Target Test      Diagrams        (sec.)
                                      PG     CG      PG     CG       PG     CG
Average of 3 best scores, Nov., 1923  10.5   10.5    34.5   28.9     111    131
Average of 3 best scores, May, 1924   14.4   15.1    44.2   50.6      72     82
Average of 3 best scores, Nov., 1924  17.7   16.7    69.3   83.9      65     74
Gain May over Nov., 1923               3.9    4.6     9.7   11.7      39     49
Gain Nov., 1924 over May, 1924         3.3    1.6    25.1   33.3       7      8
Gain Nov., 1924 over Nov., 1923        7.2    6.2    34.8   55.0      46     57





In the tapping tests, the Practice Group is not superior to the Con-
trol Group after the six months of rest. Both groups made substan-
tially the same improvements during each of the six-month periods.
In the several other motor tests, such advantages as appear shift
from one group to the other; in fact, the PG excels in 3 (marking X’s,
marking =’s and the Target Test) while the CG excels in the other 3
functions—marking oblique lines, drawing diagrams and sorting pegs. 
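Since each gain in Table IV is the difference between two of the averages printed above it, the table can be checked for internal consistency. The sketch below is one way of doing so and assumes only that a gain is the absolute change between two testing dates. The figures are entered as they appear in the table as reproduced here, so a flagged entry may reflect a misprint or a transcription error and is not evidence about the underlying data.

```python
# (Nov. 1923, May 1924, Nov. 1924, gain 1, gain 2, total gain) as printed.
TABLE_IV = {
    ("tapping", "PG"): (40.7, 49.0, 56.1, 8.3, 7.1, 15.4),
    ("tapping", "CG"): (40.3, 48.8, 57.3, 8.5, 8.5, 17.0),
    ("oblique lines", "PG"): (10.5, 17.5, 20.7, 7.0, 3.2, 10.2),
    ("oblique lines", "CG"): (11.1, 17.7, 22.5, 6.6, 4.8, 11.4),
    ("marking X's", "PG"): (35.9, 50.2, 89.1, 14.3, 39.9, 52.2),
    ("marking X's", "CG"): (38.0, 55.3, 89.5, 17.3, 34.2, 51.5),
    ("marking ='s", "PG"): (31.4, 43.7, 80.4, 12.3, 36.7, 49.0),
    ("marking ='s", "CG"): (35.8, 45.7, 76.7, 9.9, 31.0, 40.9),
    ("target test", "PG"): (10.5, 14.4, 17.7, 3.9, 3.3, 7.2),
    ("target test", "CG"): (10.5, 15.1, 16.7, 4.6, 1.6, 6.2),
    ("drawing diagrams", "PG"): (34.5, 44.2, 69.3, 9.7, 25.1, 34.8),
    ("drawing diagrams", "CG"): (28.9, 50.6, 83.9, 11.7, 33.3, 55.0),
    ("sorting pegs", "PG"): (111, 72, 65, 39, 7, 46),
    ("sorting pegs", "CG"): (131, 82, 74, 49, 8, 57),
}
GAIN_LABELS = (
    "May over Nov., 1923",
    "Nov., 1924 over May, 1924",
    "Nov., 1924 over Nov., 1923",
)


def find_mismatches(table, tolerance=0.05):
    """Return (test, group, gain label, printed, recomputed) for inconsistent gains."""
    mismatches = []
    for (test, group), (nov23, may24, nov24, *printed) in table.items():
        # Absolute change covers both rising scores and falling times (pegs).
        recomputed = (abs(may24 - nov23), abs(nov24 - may24), abs(nov24 - nov23))
        for label, shown, calc in zip(GAIN_LABELS, printed, recomputed):
            if abs(shown - calc) > tolerance:
                mismatches.append((test, group, label, shown, round(calc, 1)))
    return mismatches


mismatches = find_mismatches(TABLE_IV)
for row in mismatches:
    print(row)
```

Three entries fail the check: two gains for the PG in marking X’s and the first gain for the CG in drawing diagrams. None of them alters which group shows the larger total gain in its test.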

These results, we believe, make improbable the possibility that the 
lack of superiority of the trained group at the end of the practice 
period (May, 1924) was due to temporary boredom or staleness which
cessation of practice would remove or to insufficient time for a stimu- 
lated growth to make itself manifest. There is no evidence that the 
prolonged practice stamped in the techniques of work more per- 
manently than the shorter distributed periods. In sum, except for 
the improvement in methods, devices, ‘‘tricks of the trade,” and 
adjustments to the test situations, which are rather quickly acquired 
and renewed, no permanent effects of practice due to improved 
capacity or stimulated growth or other causes, can be detected. 

Reviewing the whole study, only one explanation of the results 
other than those given occurs to us; namely, the possibility that daily 
practice in tapping for 111 days over a period of six months might 
have resulted in such an ebb of zeal as to make the practice impotent. 
There were times, in fact, when enthusiasm in the tests was by no 
means great just as there usually are in continual practice of music, 
reading, arithmetic and most other functions. The tests were given
by experimenters who possessed marked skill in managing and interesting
children of these ages. Various devices such as displays of
the scores, contests to surpass past records, and rewards of praise 
and of more substantial form, were constantly utilized. On the 
whole, efforts to improve seemed to be genuine. 


SUMMARY AND CONCLUSIONS 


The main facts demonstrated by the experiment are as follows: 

1. Young practiced children improve rapidly in speed of tapping 
during the first dozen or so days of training. 

2. After the initial gain, improvement during approximately 100 
days of practice, on nearly every school day for a period of nearly six 
months, is slow but steady. 

3. A group of children who were given no practice between the 
eighteenth and ninety-fourth days were, in their first tests after the 
interval of no training, better than on the eighteenth day but inferior 
to the equivalent practice group on the ninety-fourth day. The 
unpracticed group improved rapidly, however, and in about ten days 
was equal to the trained group. 

4. The group which practiced for 111 days showed no superiority 
at the end to the group which practiced 18 days at the beginning and 
17 days at the end of the period, in a series of tests of speed and control 
in motor functions. 

5. After a period of nearly six months—including the summer— 
of no practice, both groups were found to be substantially equal in 
tapping and in 6 other tests of motor speed and control. 

These facts we interpret as follows: 

Improvement in tapping is due to two influences: (1) A steady 
growth or innate development or maturing of the neural and other 
mechanisms concerned and (2) the acquisitions of special and subtle 
working methods or techniques which are quickly acquired and which, 
although likely to be lost, at least in part during a period of disuse, are 
nevertheless quickly regained. The improvement brought about by 
practice seems to be wholly due to the acquisition of subtle techniques 
of work and adaptations to the working conditions. These results 
seem to be in harmony with the first of the theories concerning the 
nature of improvement mentioned at the beginning of the paper. 
Long continued training, motivated as effectively as practicable, did 
not, as we interpret the results, improve the neural and other machinery 
in any direct manner in accordance with the second theory or indirectly
by means of stimulating and accelerating the growth of these factors
or “capacities” in accordance with the third theory. 

The findings seem to shed some light on the general theories con- 
cerning the natures and relations of native and acquired ability, of 
capacity and proficiency, of which the problem of the nature of
intelligence and its relations to training is an instance. By capacity may be
meant the functional possibilities of the neural and other mechanisms 
which result in a degree of ability without highly special or intensive 
practice; factors which, as indicated by the results of the experiment, 
grow slowly but steadily both in those who are and in those who are 
not subjected to special training. Upon these factors, special training 
has no perceptible effects. 

Such factors as underlie capacity are sufficient to produce marked 
differences in ability among individuals equal in training or in the 
lack of it. A person’s proficiency normally depends partly upon
capacity due to inner growth and partly upon techniques, methods of work,
adjustments to the task, knowledge of procedures, etc., which are the 
results of practice and experience. In the case of the function here 
studied, the influence of native capacity is far greater than the effects 
of highly intensive and prolonged training. 

Anticipating this result, tapping was chosen for the experiment 
inasmuch as it promised to make possible the differentiation of the 
factors. In other functions, the relative influence of training will 
doubtless differ very greatly from that found in this study. In such 
other functions, it may nevertheless be important both for theoretical 
and practical purposes to distinguish capacity and proficiency. 

Whether the general facts found and the distinctions made in this 
study will hold for subjects of other ages, or are peculiar to young chil- 
dren and whether they will be found in other functions, motor and 
mental, or are unique in the case of tapping, only further studies will 
reveal. One other study, utilizing similar subjects but a different
function (memory for series of orally presented digits) and some
variations in the experimental procedure, has resulted in conclusions
similar to those here described.¹


¹ Journal of Educational Psychology, December, 1925, pp. 583-592.


TECHNIQUE AND DEVICES USED IN RADIOGRAPHIC
STUDY OF THE WRIST BONES OF CHILDREN 


THOMAS M. CARTER 
Albion College 


In an earlier article the writer, in conjunction with Dr. Frank N. 
Freeman, set forth some of the findings resulting from a radiographic 
study of the carpal bones of children as a means of determining 
anatomical maturity and its relation to mental development and other 
allied subjects. In a single brief article which dealt more particularly 
with results than with methodology of the investigation it was impos- 
sible to do more than make a few brief references to the technique 
used in the study. 

Since the appearance of the above mentioned article a number 
of requests have come to the authors asking that more detailed 
information relative to technique be given. The present article is 
an attempt to comply with such requests. The reader is referred to 
the above mentioned earlier article since we shall attempt not to 
repeat what is set forth therein. 

Anatomists and pediatricians are generally agreed that skeletal 
development is the best index of anatomical maturity and they are 
agreed also that the development of the bones in the wrist is the best 
index of skeletal development. Therefore the reliability of the index 
of anatomical maturity as constituted by the bones of the wrist was 
accepted in the beginning as an hypothesis. The investigation, 
we think, indicates in a number of ways the validity of the assumption. 
However, it is recognized that this is only one of a number of indices 
of anatomical maturity and that a complete knowledge of any stage 
or stages in the maturing process would require taking into account 
all of the other indices. 

The subjects used in this study were radiographed on or near 
their birthday. If the subject was as much as 30 days or more 
removed from his birthday the radiograph was considered disqualified 
for use on that account. Utmost care was taken to determine the 
exact age of each subject. In a few instances where questions arose, 
birth certificates were secured from reliable parties. 
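The qualification rule just stated can be expressed as a short check on dates. The sketch below is only an illustration: the function names are hypothetical, the comparison is made against the nearest birthday on either side of the exposure date, and treating a 29 February birthday as 28 February in ordinary years is an assumption not found in the article.

```python
from datetime import date


def birthday_in_year(birth, year):
    """The subject's birthday in a given calendar year."""
    try:
        return birth.replace(year=year)
    except ValueError:
        # Assumption: a 29 February birthday is counted as 28 February.
        return date(year, 2, 28)


def days_from_nearest_birthday(birth, exposure):
    """Days between the exposure date and the closest birthday."""
    candidates = [birthday_in_year(birth, exposure.year + k) for k in (-1, 0, 1)]
    return min(abs((exposure - b).days) for b in candidates)


def radiograph_qualifies(birth, exposure, limit_days=30):
    """A radiograph 30 or more days from the birthday is disqualified."""
    return days_from_nearest_birthday(birth, exposure) < limit_days


# Hypothetical subject born 10 March 1915.
born = date(1915, 3, 10)
print(radiograph_qualifies(born, date(1924, 3, 25)))  # 15 days after the birthday
print(radiograph_qualifies(born, date(1924, 4, 9)))   # exactly 30 days after
```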


¹ Freeman, F. N. and Carter, T. M.: A New Measure of the Development of
the Carpal Bones and Its Relation to Physical and Mental Development. Journal
of Educational Psychology, Vol. XV, No. 5, pp. 257-270.


STANDARD POSITION


In the early stages of the study it was found that the position of 
the hand at the time of radiographing had much to do with the 
appearance of the radiograph and that considerable alterations of the 
size and shape of the projections of the different bones of the wrist 
could be produced by certain hand flexions. A standard position 
was determined upon, therefore, to produce radiographs that would 
be comparable. 

After experimentation, a position was selected in which the hand
and wrist were in what has been designated the “straight position”
when the long axis of the third metacarpal bone nearly corresponds to
that of the forearm. Fick¹ selects this as the standard position with
which other positions may be compared. We shall use this position 
as a normal one and all variations and changes will be designated with 
reference to this one. 

In this position the planes of the intervals of separation between 
the various carpal bones are more nearly vertical than in any other 
position. 

Our experiments showed that most hand flexions reduce the size 
of the carpal quadrilateral from the normal and increase the size of 
the projections of the individual bones. Thus the ossification ratio, 
as an index of carpal development, is considerably distorted by a 
relatively small hand flexion.²

The following technique of positioning the hand was used throughout
the investigation. The subjects were seated when the radiographic
exposures were taken. Seats of different height were used for the
different sized individuals. For each subject a seat was used which
was just high enough so that when the right hand and forearm lay
on the table with the hand and wrist placed upon the film, the hand,
forearm, and upper arm were in a straight position with the arm
extended parallel to the table and forming with the body a right angle.
This secured the best possible position for the carpal bones. But we
wished also to secure such a position of the forearm that the radius
and the ulna would be so rotated that the joint of separation would
be vertical. This position was secured by having the subject face the
table and place his left shoulder against the table upon which the right
arm lay with the top of the shoulder in a plane with the top of the
table. The back of the right hand was placed upward throughout
the study.


¹ Fick, Rudolph: Ergebnisse einer Untersuchung der Handbewegungen mit
X-Strahlen. Verhandlungen der Anatomischen Gesellschaft, May, 1901, pp. 175-184.

² The terms “carpal quadrilateral” and “ossification ratio” were explained in
the earlier article and are more particularly specified in the latter part of this
article.

The radiographs were made upon 5 × 7 Eastman duplitized films.
A film holder containing a film was placed in an outlined quadrilateral, 
on the table, the size of the holder. This quadrilateral was first 
drawn upon a sheet of paper which was pasted to the table. Later, 
however, thick pieces of cardboard were tacked to the table to inclose 
the film holder quadrilateral. The X-ray tube was adjusted, each 
time before using, so that the long axis of the tube was perpendicular to 
the top of the table and so adjusted with respect to the film holder 
quadrilateral that the rays would fall vertically upon the hand. The 
hand was so adjusted that the carpus would be near the center of the 
focus. The tube was kept at a constant height from the table and in 
this position the film was about 20 inches from the origin of the X-ray. 
Three seconds was the normal exposure time, but slight variations 
were allowed according to the size of the hand. 

What is commonly known as the portable X-ray apparatus was
the machine used for taking the radiographs. The cabinet measured
31 inches in height and occupied a floor space of 17 × 17 inches; the height
over all, with tube holder extended, was 63 inches. The tube holder was
extended to its full height. By means of the rectifying switch the 
voltmeter was made to read 110, which with a spark gap of 3 inches, 
gave a 10 milliampere output. The tube was inclosed in a leaded glass 
jacket, but to further insure the subject against any possible danger 
from the X-rays, a rubberized protective cloth impregnated with lead 
hung between the subject and the tube. 

A lead film marker containing an easily manipulated numbering 
device was used to identify the films. Each subject was given a number 
which, with his name and age and any other important data, was 
recorded at the time the radiographic exposure was taken. The
exposures were taken either by Dr. Monilaw, the school physician, or
by the writer. Both closely adhered to the same technique of taking
the exposures. Some of the films taken in the early part of the
investigation, before a standard technique had been worked out, had to be
rejected as unreliable because the hand was improperly positioned.


MEASURING THE FILMS


From the beginning of the study it was determined that some objec- 
tive method should be used in calculating the degree of ossification 
which had taken place in the bones of the wrist. Just what method 
would be used had to be determined by experimentation. 

As was pointed out in the former article the carpal bones are so
irregular in shape that any kind of a diameter measurement could not
be used with any degree of accuracy. Only a measure of the
circumscribed area of the bones could be used to advantage and this could be
done best by the use of the planimeter.

The planimeter used in our study for measuring the projected area 
of the bones was one of the simpler type of instruments affording a 
one to one unit of measure, the size of the unit being .01 of a square 
inch. A smaller unit would have been more discriminating, but it 
could not be obtained in a one to one unit planimeter. A different 
type of instrument would have required interpolations and mathe- 
matical computations which would have increased the possibility of 
inaccuracy by complicating the process. 

The next problem confronted was that of measuring the films with- 
out injuring them. After several devices had been employed we made 
use of a discarded film from which the emulsion had been removed by 
means of hot water. This together with the negative and a piece of 
plate glass cut to the same size were securely clamped together. 

The top of the plate glass being the plane upon which the planim- 
eter must operate, it furnished no opportunity for anchoring the 
pole of the planimeter and did not furnish enough friction for the wheel 
of the planimeter. This difficulty was overcome by clamping a thin 
piece of cardboard over the area of the plate glass covered by the 
planimeter in its operations. 

In measuring the projections, a point of origin was designated by 
an ink dot made on the celluloid, just over some point on the edge of 
the projection of one of the bones. The tracer of the planimeter was 
then placed on the dot and the readings taken and written down. 
The projection was then carefully circumscribed and a second planim- 
eter reading taken. When the first number was subtracted from 
the second, the remainder indicated the area of the projection in units 
of .01 of a square inch. The eight carpal bones and the epiphyses of 
the radius and ulna were thus measured. The inspectional and
measurement record was kept on a chart provided for each hand studied.

The areas of the projections of the carpal bones, when added together,
represent the amount of ossification in absolute terms. However, it
was soon seen that the amount of ossification thus expressed was not
a good index of anatomical maturity as determined by the bones of
the wrist. Variations in the size of the bones were due partly to the
size of the hand and partly to maturation of the hand. It seemed
evident that a measure was needed to represent the size of the hand,
which could be used as the denominator of a ratio to represent the
relation of the size of the carpal bones to the size of the skeleton as a
whole.

Fig. 1. The subjoined radiographic print is of a hand of a normal twelve year old
girl. The bones are named according to the “Basle Nomina Anatomica” formula. The
carpal quadrilateral, as used in our investigation, is also illustrated.

After careful study it was found possible to determine four points
within the carpal region which, when connected by straight lines, would
form a quadrilateral, the extent of which would be determined by the
size of the hand. The quadrilateral is illustrated in Fig. 1. This
quadrilateral can be described by tracing the outline in anti-clock
fashion, beginning at the lower right angle. The first point taken is
at the outside extremity of the distal end of the ulnar shaft. The
second corner of the quadrilateral is the most proximal point on the
fifth metacarpal. The third corner is determined by the nearest point
on the epiphysis of the first metacarpal. The fourth corner is
determined by a point on the outside extremity of the distal end of the
radial shaft. The area of the carpal quadrilateral was, as in case of the
wrist bones, determined by planimeter measurement.

Thus we have two measurements, the area of the carpal quadrilateral,
which is determined by the size of the hand, and the total
ossified area in the wrist region, which is determined by the size of
the carpal bones and the epiphyses of the radius and ulna. By relating
these two measures we obtain a ratio which represents the degree of
ossification which has taken place in the wrist bones. Thus the ratio

$$\frac{\text{total ossification}}{\text{carpal quadrilateral}}$$

can be used as an index of the ossification of the
wrist bones and as an index of anatomical maturity insofar as it is
shown by the ossification of these bones.

In the more advanced stages of ossification the sides of the
quadrilateral do not include the entire area of the bone projections which
were measured. It is true also that in the more advanced stages 
of maturity there is a considerable overlapping of some of the projec- 
tion areas. Therefore the ratio of the total ossification to the carpal 
quadrilateral becomes, in some of the older ages, more than 100 per 
cent. This, however, does not affect the reliability nor detract from 
the validity of the index determined thereby. 
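The arithmetic of the measurement can be sketched as follows: each area is the second planimeter reading minus the first, in units of .01 square inch, and the ossification ratio is the summed bone areas taken as a percentage of the quadrilateral area. The readings below are hypothetical and the function names are illustrative; only three of the ten measured projections are shown, to keep the example short.

```python
UNIT_SQUARE_INCHES = 0.01  # size of one planimeter unit, as stated in the article


def traced_area(first_reading, second_reading):
    """Area of one projection in planimeter units (second reading minus first)."""
    return second_reading - first_reading


def ossification_ratio(bone_readings, quadrilateral_readings):
    """Total ossified area as a percentage of the carpal quadrilateral."""
    total = sum(traced_area(a, b) for a, b in bone_readings.values())
    quadrilateral = traced_area(*quadrilateral_readings)
    return 100 * total / quadrilateral


# Hypothetical (first, second) planimeter readings for a single hand.
bones = {
    "os capitatum": (1200, 1231),
    "os hamatum": (1231, 1255),
    "epiphysis of radius": (1255, 1290),
}

younger_hand = ossification_ratio(bones, (2000, 2150))
# A smaller quadrilateral relative to the same bone area gives a ratio above 100.
older_hand = ossification_ratio(bones, (2000, 2080))

print(younger_hand, older_hand)
```

The second case shows how the index can exceed 100 per cent when the measured projections extend beyond, or overlap within, the quadrilateral.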

Having arrived at this point in the investigation it seemed well to 
check on the whole series of methods to see if they had been sufficiently 
standardized so that like results would be obtained by repeated meas- 
urements by the same worker and by measurements of the same 
radiographs by different workers. 

Mr. E. J. Brown, a graduate student in the school of education, 
was at that time working on the problem of determining the reliability 
of a group of psycho-physical and mental tests. Mr. Brown agreed to 
include this test along with others which were being studied. Accord- 
ingly he was instructed in the technique which had been worked out 
and he proceeded to measure the radiographs of 15 hands. The same 
subjects were then radiographed again and Mr. Brown made a second 
series of measurements. The coefficient of correlation between the 
two series was .97. The present writer measured the same subjects 
and found that his two measurements gave a coefficient of correlation
of .98. The measurements by the two workers gave a correlation
coefficient of .96. The correlations were surprisingly high and we 
felt that they were sufficient to justify our technique, and accordingly 
it was adopted. 
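A reliability check of this kind can be sketched as below. The article does not name the coefficient used; the product-moment (Pearson) coefficient is assumed here. The ossification ratios are hypothetical values for five hands, not the 15 measured in the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must be of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ossification ratios from two radiographs of the same five hands.
first_series = [52.0, 61.5, 70.2, 48.3, 66.0]
second_series = [53.1, 60.8, 71.0, 47.9, 64.7]

r_repeat = pearson_r(first_series, second_series)
print(round(r_repeat, 2))
```

The same function serves for all three comparisons described: one worker's first and second series, the other worker's two series, and one worker's series against the other's.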

While the measurement investigation was being carried on, three
advanced students in education made some inspectional studies of a
part of the radiographs. The first of these was Mr. W. H. Buchanan
who made inspectional rankings of the nine-, ten-, and eleven-year old
groups of boys and girls and correlated his rankings with the
measurement rankings. He found the following coefficients of correlations:

The other two students who worked with the problem were Mr. J. G.
McElhannon and Mr. Everett Davis. They studied the seven-, eight-,
and nine-year old groups of boys and girls, the ten-year old group of
girls, and the fourteen-year old group of boys, and correlated their
rankings with each other and with the measurement. The results are
shown in Table II.


TABLE II. MEASUREMENTS OF INSPECTIONAL AND MEASUREMENT RANKINGS

                     Seven-  Seven-  Eight-  Eight-  Nine-  Nine-  Ten-   Fourteen-
                     year    year    year    year    year   year   year   year
                     boys    girls   boys    girls   boys   girls  girls  boys
McElhannon with
  measurement.....   .85     .91     .82     .73     .75    .62    .41    .53
Davis with
  measurement.....   .86     .62     .83     .73     .78    .73    .87    .92
McElhannon with
  Davis...........   .49     .81     .91     .91     .88    .69    .51    .69





























Age Norms in the Ossification Process. By the expression age norms
is meant the central tendency with respect to the degree of ossification
as it presents itself at each chronological age. We have computed
the mean, the standard deviation, and the probable error of the mean
of the distributions for each separate bone, for the total ossification,
for the carpal quadrilateral, and for the ossification ratio of each age
group of boys and girls separately, from 5 to 17 years of age.
However, the original data symbolized by these statistical devices
seem to be of too great importance to be omitted from the complete
report, as it appears in the thesis from which this article is taken.¹
Because of the difficulty of securing subjects for some of the ages
studied, we were able to extend the sample only to 20 of each sex
group, or 40 for each chronological age. This, however, was thought
sufficient to be fairly reliable since a fairly homogeneous group was
obtained by dividing the subjects into sex groups.

Fig. 2. Growth curves for individual bones of wrist, boys 5-17 years of age.

The material for the sex and various age groups can be fairly well
set forth in graphs. Some of these graphs appeared in the previous
article. The others appear here in Figs. 2 and 3. In Fig. 2 are set
forth the curves for the various individual bones of the wrist of boys.
Figure 3 presents the same material for girls.


¹ Carter, T. M.: A Study of Radiographs of the Bones of the Wrist as a Means
of Determining Anatomical Age. Unpublished Doctor’s thesis, Department of
Education, University of Chicago, 1923.

Fig. 3. Growth curves for individual bones of wrist, girls 5-17 years of age.

The foregoing graphs show that any element, whether it be the
total ossification, the carpal quadrilateral, or the ossification ratio,
exhibits considerable variability within each distribution of both
boys and girls. But the norms for either sex show that the mean
of the distribution for any of these elements increases in a fairly
regular manner as the ages increase. The standard deviations differ
considerably between the two sexes at certain ages and on the whole
the standard deviations for boys are greater than the standard deviations
for girls. The standard deviations for the ossification ratio
distributions are greater with the younger ages than with the older
ages. If we divide the 13 ages studied into three groups with ages
5 to 8 in one group, ages 9 to 13 in another group, and ages 14 to 17 in
the third group, we find the average standard deviation of the ossification
distributions for the respective groups to be as follows:

Figures 2 and 3 show that the general tendency is for the curves
of the individual bones to parallel each other. There is, to be sure, a 
considerable amount of crossing of some of the curves but this is due 
to the fact that the order in which the bones begin to ossify is not 
according to the size which the bones will ultimately reach. Some 
of the larger bones appear relatively late and then grow more rapidly 
and continue their growth longer than some of the smaller bones 
which have preceded them in order of appearance. 

Curve 5, representing the os triquetrum, starts in the fourth place 
from the top but drops to the seventh place at the upper end of the 
curve. This is true for both boys and girls. The curve is crossed by 
Curves 3, 4, and 7, representing respectively the os naviculare, the
os lunatum, and the os multangulum majus. Curve 3, representing the
os naviculare, occupies seventh place from the top at its lower end but
rises to fourth place from the top at the upper end of the curve. This 
is true for both boys and girls. 

Curve 9, representing the os capitatum, starts highest at the lower
end of the curve for both boys and girls and holds this place without
competition up to 14 years of age. From this time on it is about
equalled, in the case of the girls, by Curve 1, representing the epiphysis
of the radius. Curve 1 crosses Curve 9 in the case of the boys between
15 and 16 years of age and is considerably higher than Curve 9 at the
upper ends of the curves.

From the curves of the individual bones, for both boys and girls, 
it would appear that if one wished to select one bone which could be 
used to the best advantage to represent the degree of development 
which had taken place at any given age in the wrist region, the os 
capitatum would seem best adapted to such a purpose. It increases 
in size from age to age in a more regular fashion than any of the other 
bones with the exception perhaps of the os triquetrum and the os 
hamatum which are about equal in regularity of development. The 
os capitatum is therefore as regular in development as any of the bones 
and more regular than most of them. It is the largest of any of the 
carpal bones; therefore the difference in size from age to age would 
be more discriminating than for any of the other bones.

CONCLUSIONS


I. The process involved in securing the radiographs of the wrists of 
children to be used in a study of anatomical maturity must be stand- 
ardized and the standard must be closely adhered to if the radiographs 
are to be comparable. This is shown by radiographing and measuring
the bones of the wrist with the hand in different positions. 

II. The objective or measurement method of determining the 
degree of ossification is more reliable than the inspection method. 
This is shown by the fact that estimates of the same individuals made
by the same judge at different times and the estimates of different judges
of the same individuals were correlated higher when the measurement
method was used than when the inspection method was used.

From this and the previous article two other general conclusions 
may be added: 

1. The measurement of the absolute size of the carpal bones is 
not a reliable method by which to determine the degree of anatomical 
maturity as shown by the ossification of these bones. Proof of this 
is to be found, for example, in the fact that after 12 or 13 years of 
age the absolute size of the carpal bones of boys is larger than the 
size of these bones for girls of the same chronological age although all 
reliable indications show that girls are more advanced at all ages from 
birth to maturity. 

2. The ossification ratio, as defined and used in our investigation, 
is a reliable index of anatomical maturity. This is shown by the con- 
stant manner in which the mean of the distribution of the ossification 
ratios increases from year to year. It is also supported by other facts, for
example the relative development of boys and girls.








ON THE INFLUENCE OF EDUCATION ON INTELLI- 
GENCE AS MEASURED BY THE BINET- 
SIMON TESTS 


DAVID WECHSLER 


The Psychological Corporation, New York 


The object of this paper is to present the results of a study on the 
variability of intelligence, as measured by Binet mental ages, with 
increasing natural age, and to show what light these results throw upon 
the problem of the influence of education on intelligence. 

Experimental attack on the question of the influence of education 
on intelligence as measured by the Binet-Simon scale, has been hitherto 
primarily along two lines. The first approach has been through the 
method of partial correlations. In this method the procedure has 
consisted of obtaining measures of educational attainment, and of other 
possible contributing factors, in conjunction with intelligence ratings 
as furnished by the tests; and then inferring the contributing influence 
of each of the factors measured upon the intelligence ratings, from the 
nature of the final partial correlation ratios obtained. 

The second line of approach has been through the comparison of
successive intelligence ratings obtained from retests. Here the procedure
has consisted of examining the same subjects with the same
scale after different intervals of time, and noting whether or not there
were significant differences between the successive IQ’s so obtained.

The most thorough study of the influence of education on intelli- 
gence by the method of partial correlations is that made by Burt. 
Complete analysis of this study is to be found in his book on Mental and 
Scholastic Tests (pp. 180-183). I shall here only briefly summarize 
Burt’s method of procedure and general results. 

In his investigation Burt obtained the following data on some 
300 children between the ages of 7 and 14 years: (1) The child’s age; 
(2) his scholastic attainment as measured by educational tests and 
corrected by teachers’ estimates; (3) his native “intelligence” as 
measured by a special reasoning test and his teacher’s estimate; (4) 
his mental age as obtained with Burt’s revision of the Binet-Simon 
tests. Having these data Burt next correlated the ratings obtained
with each of these measures, and then by the method of partial correlation
eliminated successively one or more of the “factors” as contributing
influences to the mental age score.

The results of the intercorrelations were as follows: Binet mental
age and school work correlated to the extent of .91; Binet mental age
and intelligence (as measured by the reasoning tests) .84; school work
and intelligence .75; Binet mental age and school work after both intelligence
and chronological age were eliminated .61. This last coefficient
was the highest of the second-order coefficients
obtained (i.e., those obtained after any two factors among the various
combinations had been eliminated).
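
The first step of the partial-correlation method can be illustrated with the three zero-order coefficients quoted above. The sketch below is only an illustration: the function name `partial_r` is hypothetical, and it eliminates intelligence alone. It therefore does not reproduce Burt's second-order value of .61, which also requires the correlations with chronological age, and these are not given here.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Zero-order coefficients quoted from Burt's study.
r_binet_school = 0.91   # Binet mental age with school work
r_binet_intel = 0.84    # Binet mental age with the reasoning test
r_school_intel = 0.75   # school work with the reasoning test

# Binet mental age and school work with "intelligence" eliminated.
r_first_order = partial_r(r_binet_school, r_binet_intel, r_school_intel)
print(round(r_first_order, 2))
```

Even with intelligence held constant, the coefficient stays high (roughly .78), which is the pattern on which Burt's conclusion rests.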

In view of the high value of the coefficient of correlation between
Binet mental age and school attainment, .91, and its persistent high
value, .61, after the factors of intelligence and age were eliminated,
Burt concludes: “There can therefore be little doubt that with the
Binet-Simon Scale a child’s mental age is a measure not only of the
amount of intelligence with which he is congenitally endowed, not only
of the plane of intelligence at which in the course of his development
and growth he has arrived; it is also an index largely, if not mainly of
the mass of scholastic information and skill which, in virtue of attendance
more or less regular, by dint of instruction more or less effective,
he has progressively accumulated at school.” (Loc. cit., p. 182.)

Excepting the words “largely, if not mainly,” the above conclusion
appears to me a valid inference, provided one accepts Burt’s reasoning
test as a true measure of native intelligence. The assumption that it 
is such might, of course, be questioned but perhaps no more effectively 
than any other single criterion. It must furthermore be borne in 
mind that the reasoning test ratings were combined with teachers’ 
estimates of the child’s intelligence. Taken as a whole, Burt’s data 
show that intelligence as measured by Binet mental age ratings is 
unquestionably influenced by academic education. What they do not 
show is precisely how great this influence is, and to what extent if 
any, the diagnostic value of the Binet-Simon tests is reduced thereby. 

Let us now consider the second array of evidence from which con- 
clusions regarding the influence of education on intelligence have been 
drawn. This evidence, as already mentioned, is derived from studies 
on the constancy of the IQ, and consists of data furnished by retests of 
individuals with the same scale after varying intervals of time. The 
results of such studies made with the Stanford revision of the Binet-
Simon tests in the United States have been recently summarized by
Colloton and Rugg,¹ and with Burt’s revision in England by Gray
and Marsden.²





1 Journal of Educational Psychology, 1921, 12, pp. 315-322. 
2 British Journal of Psychology, Gen. Sect., 1924, 15, pp. 169-173.

Upon analyzing the results of the various reports reviewed, Rugg
and Colloton found that the average difference in IQ between first and
second testings was approximately 4.5 points for the middle 50 per cent
of the groups tested. This difference was not far from the average
figures reported by most of the investigators individually, but was not
distributed equally in either direction. These authors summarize
their findings thus: “For all studies the positive differences are nearly
twice as large as the negative, the typical positive difference being
somewhat less than six points and the typical negative difference
approximately three points.” The magnitude of these differences
has been shown by Terman to be not much greater than that
obtained by immediately succeeding retestings, that is, tests made
within a day or two of each other instead of after intervals of one, two,
or three years as in the studies reviewed by Rugg and Colloton.

The results of Marsden and Gray were very much the same as
those of Rugg and Colloton. They report: “The middle 50 per cent of
changes in IQ lie between the limits of 5.1 points decrease and 6.0
increase,” or, in terms of the probability of the IQ not varying beyond
certain limits, from one examination to another, “the chance of an IQ
showing an increase by as much as 5.5 points is 1 in 2; as much as 11
points 1 in 5; as much as 16.5, 1 in 20; as much as 22 points, the chances
are 1 in 100.”¹

The general conclusion of Gray and Marsden and of the other 
authors cited is that an individual’s IQ remains fairly constant, and 
that the results obtained from retesting of subjects after varying inter- 
vals of time show that the changes in IQ which do occur are not 
significant. The first part of the conclusion, I believe, is warranted by
the data reported; the second part, it seems to me, is a very doubtful
inference. 

The studies on the constancy of the IQ have usually been undertaken
with the view of ascertaining the prognosticating validity or
diagnostic value of the IQ rating. In the light of this criterion, results
of the retest investigations seem to indicate quite clearly that the IQ is
sufficiently constant to assure the examiner a reasonable amount of
confidence in the exactness of his rating. The chances are good that
if the same individual were tested on a different occasion or at a subsequent
date, with the same scale, his intelligence rating would be
substantially the same. Nevertheless, measurable differences are
found in the majority of the cases and in many of the cases these differences
are rather large. The chances are only even that an individual’s
IQ will not vary by more than 5.5 points (Gray and Marsden) at two
successive examinations. And this difference must be doubled if we
wish to increase the range of possible variability so as to include
83 per cent instead of only 50 per cent of the cases. Now, while this
margin may not seem too wide for practical diagnostic purposes, it
does seem too large a quantity to be lightly passed over. The variability
of 11 points in 100 (the theoretical mean IQ of the average normal
child) within a range of 2 PE’s is hardly an insignificant variation and
certainly seems to need accounting for. Some effort to do so has,
to be sure, been made. It has been explained, for instance, as being
due to the influence of practice, the inexpertness or difference in
methods of scoring of different examiners, and the varying emotional
adaptation on the part of the subject on the different examinations.
But it is not clear from the evidence that these are the only or even the
most important influences, or whether there is not some other constant
factor or factors contributing to the magnitude of the variation.

1 In the case of two investigators (Stenquist and Fermon) these differences
were considerably larger. Rugg and Colloton are inclined to regard these exceptional
results as having been “caused primarily by the non-uniform scoring of the
responses.”
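
The relation between multiples of the probable error and the share of cases they cover, used in the discussion above, can be checked with a short sketch. It assumes that the changes in IQ are normally distributed, which the source does not state explicitly; the function name `share_within` is hypothetical.

```python
import math

PE_IN_SD = 0.6745  # one probable error expressed in standard deviations

def share_within(k_pe):
    """Proportion of a normal distribution lying within +/- k probable errors."""
    z = k_pe * PE_IN_SD
    return math.erf(z / math.sqrt(2))

# Share of cases within 1, 2, 3 and 4 probable errors of the mean.
for k in (1, 2, 3, 4):
    print(k, round(share_within(k), 3))
```

Under this assumption one probable error covers half the cases and two probable errors cover a little over 82 per cent, close to the figure of 83 per cent used in the text.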

The factor whose presence or absence we are particularly desirous 
of ascertaining is that of education or academic training. Unfor- 
tunately the studies on the constancy of the IQ do not permit us to 
conclude anything definite on this question owing, in the first place, 
to the limitations of the data themselves, and, in the second place, to 
the disturbing influences inherent in the method of retesting, such as 
the effect of practice, variations of emotional attitude on the part 
of the subject at different examinations, etc. I shall present a method
of approach to the study of variability of intelligence ratings, which
seems to eliminate these difficulties, and which, I believe, permits the
attack of the problem of the influence of education on intelligence in
a more rigorous way.

The method about to be considered consists of analysis of the range 
of variability of intelligence as measured by Binet mental age, at the 
different chronological year levels over which the Binet-Simon scale 
extends. The theory of this procedure is that if education influences 
the intelligence ratings, the variability of these ratings should be modi- 
fied as the individuals tested have been more or less subjected to that 
influence, that is as they grow older. If the effect of education is to
make us more alike, the variability should become smaller with increasing
age; if it tends to make us more unlike, the variability should
become greater as the individual grows older. On the other hand, if
the range of variability should be found to remain approximately 
constant, the influence of education might be said to be negligible. 

In order to answer the above questions about any particular 
scale it is, of course, necessary to have original distributions of mental 
age ratings for each of the chronological years embraced by the 
scale employed.¹ The next step is to obtain the mean mental ages
and their standard deviations for the different chronological age levels.
These means and standard deviations calculated from the original
distributions of both the Terman and Burt revisions of the Binet
Scale are given in columns (3) and (4) of Tables I and II. (The year
levels below 6 and above 14 were omitted from consideration, owing 
to both the diminished reliability and the inherent limitations to 
variability at the extreme ends of the scales.) Having the means and 
standard deviations it is, of course, an easy matter to obtain an 
expression of variability by dividing the latter by the former and multi- 
plying the quotient by 100 (Pearson’s coefficient of variability). The 
coefficients of variability thus obtained together with their probable
errors are given for each chronological year in column (5) of Tables
I and II.
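
As a minimal sketch of the computation just described, the coefficient of variability can be obtained from a mean and a standard deviation as follows. The function name is hypothetical; the two sets of figures are the age 6 row of Table I and the age 12.5 row of Table II.

```python
def coefficient_of_variability(mean, sd):
    """Pearson's coefficient of variability: 100 * SD / mean."""
    return 100.0 * sd / mean

# Table I (Terman), chronological age 6: mean 74.4492 months, SD 9.5718.
cv_terman_6 = coefficient_of_variability(74.4492, 9.5718)

# Table II (Burt), chronological age 12.5: mean 12.3 years, SD 1.18.
cv_burt_12_5 = coefficient_of_variability(12.3, 1.18)

print(round(cv_terman_6, 2), round(cv_burt_12_5, 2))
```

Because the coefficient is a ratio, it is the same whether mental age is expressed in months or in years, which is what allows the two revisions to be compared.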

Inspection of the coefficients reveals that, with some exceptions, 
their absolute values tend to become smaller and smaller as the ages 
of the subjects increase. This fact appears more outstandingly in 
Burt’s than in Terman’s data. It is, however, not possible from 
inspection of the figures alone to evaluate the nature of the variability 
indicated. The question is whether the differences revealed are 
significant. The answer to this question can be obtained by calculating²
the probable errors of the differences between the coefficient of
variability of each chronological age and that of every other chronological
age in the group. This has been done in Tables III and IV.
A difference is considered significant when it is equal to three times
its probable error. Where such is the case, the figures in the table are
given in bold face type.

1 I wish to express my indebtedness to both Professor Terman and Dr. Burt for
furnishing me these data.

2 I am indebted for these computations to Mr. B. Malzberg, statistician to
the New York State Board of Charities.

The formula used for the calculation of the PE’s of the differences is:

$$\mathrm{PE}(C_{I} - C_{II}) = \sqrt{e_{I}^{2} + e_{II}^{2}}$$

where $C_{I}$ and $C_{II}$ are the coefficients of variability and $e_{I}$ and $e_{II}$ their respective
probable errors. The PE of the coefficient of variability is given by the formula,

$$\mathrm{PE}(V) = \frac{.67449\,V}{\sqrt{2N}}\sqrt{1 + 2\left(\frac{V}{100}\right)^{2}}$$

where V is the coefficient of variation and N the number of cases.
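
The two probable-error formulas given in the footnote, together with the three-times criterion, can be sketched as follows. The function names are hypothetical; the figures are the coefficients for ages 6.5 and 12.5 in Table II, each based on approximately 200 cases.

```python
import math

def pe_of_cv(v, n):
    """Probable error of a coefficient of variability v (per cent), n cases."""
    return 0.67449 * v / math.sqrt(2 * n) * math.sqrt(1 + 2 * (v / 100) ** 2)

def pe_of_difference(e1, e2):
    """Probable error of the difference between two independent coefficients."""
    return math.hypot(e1, e2)

# Coefficients of variability for ages 6.5 and 12.5 in Table II.
v_young, v_old, n = 12.77, 9.59, 200

e_young = pe_of_cv(v_young, n)
e_old = pe_of_cv(v_old, n)
difference = v_young - v_old
pe_diff = pe_of_difference(e_young, e_old)

# The criterion used in the text: a difference of three or more probable errors.
is_significant = difference >= 3 * pe_diff
print(round(e_young, 4), round(e_old, 4), round(difference, 2), is_significant)
```

With these inputs the two probable errors agree with those printed in Table II (.4376 and .3264), and the difference of 3.18 exceeds three times its probable error.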

The results are extremely interesting. In neither Burt’s distributions
nor Terman’s is there any significant increase or decrease of
variability between the ages of 6 and 9. In Burt’s data this continues
to be the case through year 10. But beginning with year 10 in Terman’s
distributions and year 11 in Burt’s, significant differences begin
to appear. In Burt’s distributions there is a very sharp change at this
point. From here on nearly all coefficients of variability show
significant differences. The coefficients of variability become smaller
with increasing age. All the differences are significant
with the exception of those between years 10.5 and 11.5, and 12.5
and 13.5. Out of 26 differences (comparing the coefficients of years 10
through 14 with each other and with those of every other year in the
table) 2 are insignificant and 24 are significant and positive.¹

Terman’s distributions show no uniform changes of variability
until we reach the age of 14. Until this age is reached the significant
differences encountered are more or less sporadic. There is a significant
difference between years 10 and 6, years 11 and 9, and year 13 and
years 7 and 10. Year 14, on the contrary, shows significant differences
when compared with all the other years in the table except years 10 and
9. Out of 30 differences compared 21 are insignificant and only 9 are
significant, and of these some are negative.

There is thus a discrepancy in the results obtained from the data 
of the two revisions of the Binet-Simon Scales. Terman’s data indi- 
cate no significant change of variability in intelligence with mental age; 
Burt’s data, on the contrary, indicate a very clear cut alteration in 
variability, the change occurring at a very definite point in the course 
of the child’s development. We have reason to ask ourselves, there- 
fore, which of the data are to be accepted and the possible reasons for 
the discrepancy revealed. 

In the opinion of the writer, the results obtained from analysis of
Burt’s data are more to be trusted, and for the following reasons: In
the first place, Burt’s data were obtained on the average from about
twice the number of subjects. The Stanford Revision is based on fewer
than 100 cases per year level, roughly in equal distribution between
the sexes. Burt’s standardization, on the other hand, is based on 200
cases per year level (roughly, 100 boys and 100 girls). In the second
place, the differences of variability revealed by the analysis of Burt’s
data are so clear cut that it is extremely unlikely that they are due to
chance. The only alternative explanation is that these differences
are due to the operation of some constant error or errors, but of these
there is no evidence. Finally, and most important, is the fact that
the point at which the significant differences begin to appear is precisely
that at which we should expect it, if education did influence mental
age ratings, namely when formal academic instruction begins to be the
important source of the child’s intellectual acquisitions.

1 A difference is positive when the coefficient of variability of the higher chronological
year is smaller than the coefficient of the lower chronological year, for
instance when the coefficient of year 12 is less than that of year 11. Positive
differences thus indicate decreased variability.

If we compare the contribution which academic training makes 
toward the mental equipment of the child, with that which he acquires 
through natural growth, it is not hard, I think, to show how relatively 
small the former is. Compare, for instance, a child’s learning to read 
with his acquisition of the fundamental concepts of language, meanings 
of words, etc., in the course of his natural development. Or, again, 
the motor coordinations in learning to write with those involved in
speech or general orientations of the body. The first few years of
formal education, indeed, emphasize the training of the native sensory, 
perceptual and conceptual powers of the child. The acquisition of the 
three R’s to which the major portion of the child’s first four years of 
schooling is devoted consists primarily in a systematic training of 
these powers. Learning to read consists primarily in the formation 
of certain visual and motor percepts and the acquisition of certain 
fundamental concepts; learning to write, of acquiring some specialized
neuro-muscular habits; in arithmetic an attempt is made to give the child
the fundamental concepts of number and numerical relations.

The early years of the child’s education, then, even in the school are
concerned with the training of his sensory, perceptual and elementary
conceptual powers. In the average school system this instruction is
emphasized for approximately the first four years of the school curriculum.
Beginning with the fifth year more and more emphasis
begins to be placed on the acquisition of facts, information and
cultural material. History, geography, literature, science are given
progressively more and more time. The child begins to acquire
knowledge.

Now, analysis of Burt’s data shows that from the age at which the
acquisition of knowledge begins to play an important role in the
schooling of the child, as compared with the training and development of
his perceptual and conceptual powers, the influence of formal education
on mental age ratings begins to make itself felt more and more. In
short, the acquisition of knowledge, or what we shall now use as a
synonymous term, education, seems to have a significant influence on
intelligence as measured by the Binet-Simon tests. The influence of
education is to make us less variable, that is, to make us more alike.


SUMMARY AND GENERAL CONCLUSIONS 


In this paper the writer has attempted to attack the problem of the 
influence of education on intelligence as measured by the Binet-Simon 
tests through a study of variability in mental age with increasing 
chronological age. This was done by calculating the coefficients of 
variability of mental age for each of the chronological year levels from 
years 6 through 14 for both the Terman and Burt revisions of the
Binet-Simon scales, and then examining the reliability of the differences 
found. The theory behind the procedure was that if education 
influences intelligence, the variability of intelligence ratings obtained 
with the scale should either increase or, more probably, decrease with 
increasing chronological age. 

The results of the above analysis showed that no significant differences
in the coefficients of variability were revealed by either scale
up to the years 10-11. From this point on, however, Burt’s data
showed uniformly significant differences of variability, while Terman’s
distributions failed to show any significant differences except for
year 14. In both cases, wherever the differences did occur, they were
in nearly all instances in the direction of a decrease in variability with
increasing chronological age.


TABLE I. DISTRIBUTION AND VARIABILITY OF MENTAL AGES (TERMAN’S REVISION
OF BINET-SIMON TESTS)

(1)             (2)          (3)               (4)          (5)
Chronological   Number of    Mean mental age   Standard     Coefficient of
age             cases        (in months)       deviation    variation

 5               56           60.7321           7.8478      12.92 ± .5921
 6              118           74.4492           9.5718      12.86 ± .4058
 7               93           86.4516          10.5346      12.18 ± .4324
 8               98           98.0204          12.4260      12.67 ± .4388
 9              113          107.7522          12.9475      12.02 ± .4526
10               86          124.4302          13.7465      11.05 ± .4225
11               79          134.1646          18.5182      13.80 ± .5336
12               82          142.3049          18.4481      12.96 ± .4908
13               98          150.5306          20.8702      13.86 ± .4814
14               82          163.7439          17.3373      10.59 ± .3975

The writer is inclined to the view that the failure of Terman’s data 
to show the phenomenon revealed by Burt’s data is due to the 
insufficiencies of the Stanford Standardization of the Binet-Simon 
Tests. 

Giving preference to Burt’s data over those of Terman, the writer 
interprets the significant decrease in variability of mental age with 
increasing chronological age (from years 11 through 14 inclusive) as 
evidence of the influence of education on intelligence as measured 
by Binet-Simon mental age ratings. The absence of any significant
differences below the age of 10 is interpreted to show that below this
age the mental ratings of the Binet-Simon tests are not influenced by
education.

In general the data on MA variability show that education is
less of a factor in younger than in older children, as would naturally be
expected, and that towards the lower end of the Binet scale (years 6
to 10) it is probably negligible. If this is so, there follows the extremely
interesting, though, at first thought, apparently paradoxical corollary,
that the intelligence rating (mental age, or for that matter, IQ)
obtained with the Binet-Simon scale of a child at an early age (8 to
10 years) more nearly represents his potential native equipment than
one obtained when he is much older (14 to 16). This would probably
hold true for any other scale that might be used.


TABLE II. DISTRIBUTION AND VARIABILITY OF MENTAL AGES (BURT’S REVISION
OF BINET-SIMON TESTS)

Chronological   Approximate       Mean mental age   Standard     Coefficient of
age             number of cases   (in years)        deviation    variation

 5.5            200                5.5              0.59         10.73 ± .4008
 6.5            200                6.5              0.83         12.77 ± .4376
 7.5            200                7.6              1.03         13.55 ± .4653
 8.5            200                8.6              1.15         13.37 ± .4589
 9.5            200                9.4              1.24         13.19 ± .4525
10.5            200               10.6              1.29         12.15 ± .4161
11.5            200               11.4              1.26         11.05 ± .3772
12.5            200               12.3              1.18          9.59 ± .3264
13.5            200               13.1              1.20          9.16 ± .3115
14.5            200               13.8              1.09          7.90 ± .2797


TABLE III. DIFFERENCES IN VARIABILITY OF INTELLIGENCE (MENTAL AGE)
RATINGS AT DIFFERENT CHRONOLOGICAL AGE LEVELS (BASED ON AN ANALYSIS
OF TERMAN’S DATA)

Yrs.  7             8             9             10            11            12            13            14
6     .68 ± .593¹   .19 ± .594    .84 ± .608    1.81 ± .586   .94 ± .640    .10 ± .637    1.00 ± .630   2.27 ± .568
7                   .49 ± .613    .16 ± .626    1.13 ± .605   1.62 ± .687   .78 ± .654    1.68 ± .647   1.59 ± .587
8                                 .65 ± .627    1.62 ± .606   1.13 ± .688   .29 ± .655    1.19 ± .648   2.08 ± .588
9                                               .97 ± .619    1.78 ± .700   .94 ± .668    1.84 ± .683   1.43 ± .602
10                                                            2.76 ± .681   1.91 ± .648   2.81 ± .641   .46 ± .580
11                                                                          .84 ± .653    .06 ± .719    3.21 ± .655
12                                                                                        .90 ± .687    2.37 ± .632
13                                                                                                      3.27 ± .624

1 Interpret as follows: The difference in variability between ages 6 and 7 years is .68 ± .593;
between ages 6 and 8, .19 ± .594, etc.


TABLE IV. DIFFERENCES IN VARIABILITY OF INTELLIGENCE (MENTAL AGE)
RATINGS AT DIFFERENT CHRONOLOGICAL AGE LEVELS (BASED ON AN ANALYSIS
OF BURT’S DATA)

Yrs.  7.5           8.5           9.5           10.5          11.5          12.5          13.5          14.5
6.5   .78 ± .639    .60 ± .634    .42 ± .629    .62 ± .604    1.72 ± .578   3.18 ± ...    3.61 ± .537   4.87 ± .519
7.5                 .18 ± .654    .36 ± .649    1.40 ± .624   2.50 ± .599   3.96 ± ...    4.39 ± .560   5.65 ± .542
8.5                               .18 ± .644    1.22 ± .619   2.32 ± .593   3.78 ± .563   4.21 ± ...    5.47 ± .537
9.5                                             1.06 ± .615   2.14 ± .614   3.60 ± .589   4.03 ± .549   5.29 ± .532
10.5                                                          1.10 ± .561   2.56 ± .529   2.99 ± ...    4.26 ± .501
11.5                                                                        1.46 ± .499   1.89 ± .489   3.15 ± .469
12.5                                                                                      .43 ± .451    1.69 ± .429
13.5                                                                                                    1.26 ± .419


THE STANDARD AND PROBABLE ERRORS OF
MEASUREMENT


C. W. ODELL 


University of Illinois 


With the recent proposal to use the standard or probable error of 
measurement as a measure of reliability the writer is in full accord. 
He wishes to point out, however, that several different formulas by 
which to compute the errors of measurement have been suggested, 
and that in many cases these do not give the same results. Some 
persons using these formulas appear to have assumed that the only 
cause for the difference in the results obtained by the various formulas 
was that the data being dealt with usually do not form a normal 
frequency distribution. Although it is true that this does cause some 
of the differences, it is not responsible for all or even most of them. 
The purpose of this article is, therefore, to point out the differences 
between several of the more commonly used formulas and to comment 
on their meaning. 

Before proceeding to do this, it will perhaps be well to recall that 
the standard and probable errors of measurement are measures of the 
errors which exist in a set of scores or measures obtained by giving 
one form of a test. In other words, they measure the differences 
existing between the scores actually obtained by giving one form, and 
the theoretically true scores.¹ To calculate them, however, two
series of actual scores are required. 

Probably the most commonly used formula for the standard error²
of measurement is $\sigma\sqrt{1 - r}$. This is merely the formula for the stand-
ard error of estimate modified so that it measures the errors between
a set of obtained scores and the true scores, instead of those between
two sets of obtained scores. Since in any given case there are two
series of obtained scores, or sometimes more, and one usually desires
an average measure of the errors involved in using any one of them,


1 A theoretically true score may be defined as the average of an infinite number 
of obtained scores corrected for any constant errors. For practical purposes, it 
may be taken as the average of as many scores as are available. 

2 From now on, only the standard error of measurement will be discussed. 
Since the probable error of measurement equals .6745 times the standard error of 
measurement, whatever is said about the latter holds for the former except for the 
necessity of multiplying by the decimal just given. 
and furthermore since one usually does not know which series most
nearly approximates the true scores, the standard deviations are
commonly averaged. Doing this, the formula becomes
$\frac{\sigma_1 + \sigma_2}{2}\sqrt{1 - r}$.
In addition to this formula, three others will be considered. These
are as follows:
$\sqrt{\frac{1}{2N}\left[\Sigma d_{1-2}^2 - \frac{(\Sigma X - \Sigma Y)^2}{N}\right]}$,
$.7071\,\sigma_{dif.}$, and .8863 mean difference. The first of these three contains no terms
not directly available from the tabulations of the distributions of
scores of the two series. In the next one, $\sigma_{dif.}$ is the standard devia-
tion of the differences between the scores in the two series. The mean
difference used in the last formula is also that between the scores in
the two series.

The accompanying problem has been given to illustrate the com- 
putation of the standard error of measurement according to these 
four formulas for the scores of 25 individuals upon two applications 
of an individual intelligence scale. The first two columns in the table 
contain the scores themselves, the next two their deviations from the 
means, the third pair the squares of those deviations, the seventh the 
products of the corresponding deviations, the eighth the differences 
between the corresponding scores, and the ninth the squares of these 
differences. The last three columns will be referred to later. By 
using the means, the standard deviations, the coefficient of correlation 
and the sums of certain of the columns, the values of the standard 
error of measurement according to the four formulas may be obtained. 
It will be seen that these values are 5.75, 6.52, 7.11 and 6.81, thus 
showing rather large differences. 
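
The four computations described above can be checked with a short sketch. It is only an illustration: the function names are hypothetical, the score lists are transcribed from the accompanying table, and standard deviations are taken about the mean with a divisor of N, which is what the printed values of 21.54 and 25.98 imply.

```python
import math

# Mental ages in months for the 25 individuals, transcribed from the table.
first = [72, 74, 78, 80, 84, 84, 88, 90, 94, 96, 98, 104, 106, 108, 112,
         114, 116, 120, 122, 126, 128, 132, 132, 144, 148]
second = [68, 76, 80, 90, 84, 90, 96, 84, 98, 88, 92, 108, 102, 132, 100,
          120, 128, 118, 128, 140, 134, 128, 148, 170, 148]


def sd(xs):
    """Standard deviation about the mean, dividing by N."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return cov / (sd(xs) * sd(ys))


def four_standard_errors(xs, ys):
    """Return the standard error of measurement by each of the four formulas."""
    n = len(xs)
    r = pearson_r(xs, ys)
    diffs = [x - y for x, y in zip(xs, ys)]
    sum_d2 = sum(d * d for d in diffs)
    # 1. Averaged standard deviations times sqrt(1 - r).
    f1 = (sd(xs) + sd(ys)) / 2 * math.sqrt(1 - r)
    # 2. Squared differences, corrected for the difference between the sums.
    f2 = math.sqrt((sum_d2 - (sum(xs) - sum(ys)) ** 2 / n) / (2 * n))
    # 3. .7071 times the root-mean-square difference.
    f3 = 0.7071 * math.sqrt(sum_d2 / n)
    # 4. .8863 times the mean absolute difference.
    f4 = 0.8863 * sum(abs(d) for d in diffs) / n
    return f1, f2, f3, f4


results = four_standard_errors(first, second)
print([round(v, 2) for v in results])
```

Carried out without intermediate rounding, the first formula gives a value a few thousandths below the printed 5.75, since the article rounds the standard deviations and r before multiplying; the other three agree with the printed 6.52, 7.11 and 6.81 to two decimals.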

To explain why these differences exist and just what each standard 
error of measurement measures, it is necessary to recall that the errors 
present in measurement may be either constant or variable, and 
furthermore, that the latter may be of two kinds. They may be due 
either to imperfections in the measuring instruments themselves, or 
to chance occurrences which have no connection with the tests. 
Imperfections of the first sort bring about the result that two sup- 
posedly duplicate and equal forms of a test do not yield the same 
results under identical conditions.¹ The chance occurrences referred

1 In actual practice conditions are never absolutely identical. It is necessary,
however, to recognize that even if they were identical, no tests, at least none so far
constructed, would yield identical results when applied more than once.
MA in months
First | Second | $d_1$ | $d_2$ | $d_1^2$ | $d_2^2$ | $d_1 d_2$ | $d_{1-2}$ | $d_{1-2}^2$ | $S_2'$ | $d_{1-S_2'}$ | $d_{1-S_2'}^2$
   72 |     68 |  −34 |  −42 |  1,156 |  1,764 |  1,428 |   4 |    16 |    71.18 |    .82 |      .6724
   74 |     76 |  −32 |  −34 |  1,024 |  1,156 |  1,088 |   2 |     4 |    77.81 |   3.81 |    14.5161
   78 |     80 |  −28 |  −30 |    784 |    900 |    840 |   2 |     4 |    81.13 |   3.13 |     9.7969
   80 |     90 |  −26 |  −20 |    676 |    400 |    520 |  10 |   100 |    89.42 |   9.42 |    88.7364
   84 |     84 |  −22 |  −26 |    484 |    676 |    572 |   0 |     0 |    84.44 |    .44 |      .1936
   84 |     90 |  −22 |  −20 |    484 |    400 |    440 |   6 |    36 |    89.42 |   5.42 |    29.3764
   88 |     96 |  −18 |  −14 |    324 |    196 |    252 |   8 |    64 |    94.39 |   6.39 |    40.8321
   90 |     84 |  −16 |  −26 |    256 |    676 |    416 |   6 |    36 |    84.44 |   5.56 |    30.9136
   94 |     98 |  −12 |  −12 |    144 |    144 |    144 |   4 |    16 |    96.05 |   2.05 |     4.2025
   96 |     88 |  −10 |  −22 |    100 |    484 |    220 |   8 |    64 |    87.76 |   8.24 |    67.8976
   98 |     92 |   −8 |  −18 |     64 |    324 |    144 |   6 |    36 |    91.08 |   6.92 |    47.8864
  104 |    108 |   −2 |   −2 |      4 |      4 |      4 |   4 |    16 |   104.34 |    .34 |      .1156
  106 |    102 |    0 |   −8 |      0 |     64 |      0 |   4 |    16 |    99.37 |   6.63 |    43.9569
  108 |    132 |   +2 |  +22 |      4 |    484 |     44 |  24 |   576 |   124.24 |  16.24 |   263.7376
  112 |    100 |   +6 |  −10 |     36 |    100 |    −60 |  12 |   144 |    97.71 |  14.29 |   204.2041
  114 |    120 |   +8 |  +10 |     64 |    100 |     80 |   6 |    36 |   114.29 |    .29 |      .0841
  116 |    128 |  +10 |  +18 |    100 |    324 |    180 |  12 |   144 |   120.92 |   4.92 |    24.2064
  120 |    118 |  +14 |   +8 |    196 |     64 |    112 |   2 |     4 |   112.63 |   7.37 |    54.3169
  122 |    128 |  +16 |  +18 |    256 |    324 |    288 |   6 |    36 |   120.92 |   1.08 |     1.1664
  126 |    140 |  +20 |  +30 |    400 |    900 |    600 |  14 |   196 |   130.87 |   4.87 |    23.7169
  128 |    134 |  +22 |  +24 |    484 |    576 |    528 |   6 |    36 |   125.90 |   2.10 |     4.4100
  132 |    128 |  +26 |  +18 |    676 |    324 |    468 |   4 |    16 |   120.92 |  11.08 |   122.7664
  132 |    148 |  +26 |  +38 |    676 |  1,444 |    988 |  16 |   256 |   137.51 |   5.51 |    30.3601
  144 |    170 |  +38 |  +60 |  1,444 |  3,600 |  2,280 |  26 |   676 |   155.75 |  11.75 |   138.0625
  148 |    148 |  +42 |  +38 |  1,764 |  1,444 |  1,596 |   0 |     0 |   137.51 |  10.49 |   110.0401
2,650 |  2,750 |      |      | 11,600 | 16,872 | 13,232 | 192 | 2,528 | 2,650.00 | 149.16 | 1,356.1680

$\Sigma d_1 d_2$ = 13,232 − 60 = 13,172

           First    Second
M            106       110
$\sigma^2$   464    674.88
$\sigma$   21.54     25.98

$r = \frac{13{,}172}{25 \times 21.54 \times 25.98} = .9415$

$\sigma_m = \frac{21.54 + 25.98}{2}\sqrt{1 - .9415} = 5.75$

$\sigma_m = \sqrt{\frac{1}{2 \times 25}\left[2528 - \frac{(2650 - 2750)^2}{25}\right]} = 6.52$

$\sigma_m = .7071\sqrt{\frac{2528}{25}} = 7.11$

$\sigma_m = .8863 \times \frac{192}{25} = 6.81$

$S_2' = \frac{21.54}{25.98}S_2 + 106 - \frac{21.54}{25.98} \times 110 = .8291\,S_2 + 14.80$

$\sigma_m = 21.54\sqrt{1 - .9415} = 5.21$

$\sigma_m = \sqrt{\frac{1}{2 \times 25}\left[1356.168 - \frac{(2650 - 2650)^2}{25}\right]} = 5.21$

$\sigma_m = .7071\sqrt{\frac{1356.168}{25}} = 5.21$

$\sigma_m = .8863 \times \frac{149.16}{25} = 5.29$


to above are due to such facts as that one pupil breaks a pencil point, 
another is distracted by some noise or by something he sees outside 
the window, another copies from someone else’s work, and so on. 
The first two formulas measure the variable errors present, including 
those of both kinds, whereas the last two measure all of the errors, 
both variable and constant. This can easily be seen by looking at the 
formulas. The first depends upon the values of the standard deviation 
and of the coefficient of correlation, which are not affected by constant
errors. That is to say, if a given amount is added to or subtracted 
from each of the scores in one or both of the series, no change is made 
in the coefficient of correlation or in the standard deviations. 


Although the second formula is based upon the differences between
the scores and these are affected by any constant errors present, the
constant error is allowed for and its effect eliminated by the formula.¹
If the two distributions have equal standard deviations, the values
of the standard error of measurement obtained from the first and
second formula will agree. If the two distributions are normal and
also have equal means, or in other words, if there is no constant error
or difference, the third formula also will yield the same result. The
last one gives the same value only in case the differences between the
corresponding pairs of measures are themselves normally distributed.
The procedure shown by the last three columns at the right and
the last three lines at the bottom of the table may be used to secure a
measure of the variable errors actually due to imperfections in the
measuring instrument, and to exclude those due to accidental occur-
rences. The scores in one series are expressed in terms of the other
series by means of an equivalent score equation. The one used in
this case is $S_2' = \frac{\sigma_1}{\sigma_2}S_2 + M_1 - \frac{\sigma_1}{\sigma_2}M_2$. By substituting the proper


quantities and reducing, it is found that to secure the first series 
scores equivalent to those in the second series or, in other words, to 
express the second series scores in terms of the first, each score in the 
second series must be multiplied by .8291 and have 14.80 added to it. 
So doing gives the entries in the column headed $S_2'$. The next column
contains the differences between these equivalent scores and the first
series scores, and the last column the squares of these differences.
Making use of the equivalent scores instead of the original Series 2
scores, it will be seen that the first three equations yield the same value,
5.21, of the standard error of measurement,² but that the fourth yields


1 This is accomplished by the subtraction of the last term in the formula. The
numerator of this term is the difference between the sums of the two series of scores;
therefore allowing for this difference, and subtracting it, corrects for the difference
between the means of the two series, or in other words, for the constant error.

2 Since each equivalent score is a Form 2 score multiplied by a constant factor,
and with a constant amount added, the coefficient of correlation between Series 1
and the equivalent scores is the same as that between Series 1 and 2. Also since
the equivalent scores are expressed in terms of Series 1, their standard deviation
is the same as that of Series 1.
a slightly different value. This difference is due to the fact, mentioned
above, that the distribution of the differences between the scores in
the two series is not normal. The standard error involved is 5.21
(or 5.29) in using Series 1 scores. That involved in using Series 2
scores may be found by expressing those of Series 1 in terms of Series 2
and proceeding as above.
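
The equivalent-score procedure can be sketched as follows. This is an illustration under stated assumptions, not the author's worksheet: the function names are hypothetical, the data are transcribed from the table, and the slope and constant are kept unrounded instead of being rounded to .8291 and 14.80 as in the article, so individual equivalent scores may differ from the printed ones in the second decimal.

```python
import math

first = [72, 74, 78, 80, 84, 84, 88, 90, 94, 96, 98, 104, 106, 108, 112,
         114, 116, 120, 122, 126, 128, 132, 132, 144, 148]
second = [68, 76, 80, 90, 84, 90, 96, 84, 98, 88, 92, 108, 102, 132, 100,
          120, 128, 118, 128, 140, 134, 128, 148, 170, 148]


def mean(xs):
    return sum(xs) / len(xs)


def sd(xs):
    """Standard deviation about the mean, dividing by N."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (sd(xs) * sd(ys))


def equivalent_scores(source, target):
    """Express `source` scores on the scale (mean and SD) of `target`."""
    ratio = sd(target) / sd(source)
    return [ratio * s + mean(target) - ratio * mean(source) for s in source]


# Series 2 expressed in terms of Series 1.
s2_equiv = equivalent_scores(second, first)

n = len(first)
diffs = [a - b for a, b in zip(first, s2_equiv)]
rms_based = 0.7071 * math.sqrt(sum(d * d for d in diffs) / n)   # third formula
mean_based = 0.8863 * sum(abs(d) for d in diffs) / n            # fourth formula
shortcut = sd(first) * math.sqrt(1 - pearson_r(first, second))  # sigma * sqrt(1 - r)

print(round(rms_based, 2), round(mean_based, 2), round(shortcut, 2))
```

Because the equivalent scores have the same mean and standard deviation as Series 1, the root-mean-square formula and $\sigma\sqrt{1 - r}$ agree, while the mean-difference formula departs slightly, as the text describes.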

Although use has been made of equivalent scores, this is not neces- 
sary but has been done merely to make clear the distinction involved. 
It is possible to secure the same result, that is, to secure the value of 
the standard error of measurement when one series is expressed in 
terms of the other, without actually going through this operation. 
Attention has been called to the fact that the standard deviation of 
the first series and that of the equivalent scores in terms of the first 
series, is the same. Therefore, all that one needs to do is to make use of
the formula $\sigma\sqrt{1 - r}$ and substitute for $\sigma$ the standard deviation of
the series in terms of which the other series has been expressed. The
resulting standard error is the error involved in using the scores of
this series. If, instead of expressing Series 2 in terms of Series 1,
the reverse were done, one would then use the standard deviation of
Series 2, that is 25.98, instead of 21.54, and the result would give the
standard error involved in using the Series 2 scores. For this standard
error, we have $25.98\sqrt{1 - .9415} = 6.28$.¹
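
A minimal sketch of this shortcut, using the rounded values printed in the article (the function name is hypothetical):

```python
import math

# Rounded values as printed in the article.
sd_series_1, sd_series_2, r = 21.54, 25.98, 0.9415


def standard_error(sd, r):
    """sigma * sqrt(1 - r) for the series whose SD is supplied."""
    return sd * math.sqrt(1 - r)


se_1 = standard_error(sd_series_1, r)   # error in using Series 1 scores
se_2 = standard_error(sd_series_2, r)   # error in using Series 2 scores
average = (se_1 + se_2) / 2             # same as averaging the SDs first

print(round(se_1, 2), round(se_2, 2), round(average, 2))
```

The average of the two series values reproduces the figure given by the first formula, because averaging the standard errors is the same as averaging the standard deviations before multiplying by $\sqrt{1 - r}$.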

To summarize, the significance of the standard errors of measure-
ment obtained in the accompanying example is as follows: 5.21
(or 5.29) is the standard error (of the imperfections in the test itself)
involved in using Series 1 scores; 6.28 is the similar standard error
involved in using Series 2 scores; 5.75 is the average standard error 
of the two series of all the variable errors, including those due to both 
imperfections in the test itself and accidental occurrences; 6.52 is a 
measure of the same thing, but differs from the last value given because 
the standard deviations of the two distributions are not the same; 
7.11 (or 6.81) is the standard error of all the differences between the 
scores in the two series, both constant and variable. 








1 It will be noticed that, of course, the standard error of measurement given by
the first formula when applied to the two original series is the average of the two
standard errors obtained for the two differences. That is, $5.75 = \frac{5.21 + 6.28}{2}$.
STUDIES OF ACHIEVEMENT TESTS


BEN D. WOOD 
Columbia College, Columbia University 
(Continued from February Issue) 


Part III 
SPEARMAN-BROWN RELIABILITY PREDICTIONS 


It is obvious that the correlation averages of Tables I-IV make
possible an empirical test of the veracity of the reliability prophecies
afforded by the Spearman-Brown formula $r_{nn} = \frac{n r_{11}}{1 + (n - 1) r_{11}}$. In
view of the fact that the entries for the small groups of questions, the
10’s, the 20’s and 30’s, are the averages of from 3 to 16 coefficients
of correlation, the test of the validity of the S-B formula here
presented may be considered as having more weight than similar
empirical studies depending on series of single r’s for each cumulation
of examination components.

It is also to be noticed that the conditions here are closely similar to
those implied in the assumptions underlying the formula, as described
by Spearman and Brown. The components in all cases consist of
questions as similar as purely random selections from a single type of
questions could be, and the increments in all cases consist of exactly
equal numbers of such similar questions.

The general result is that if we start with the empirical reliability of
a component of 20 examination elements the S-B formula very closely
predicts the empirical reliabilities up to 100 examination items in
about half the cases and underestimates them in the other half
of the cases by from .04k to .10k. ($k = \sqrt{1 - r^2}$)

These results need not necessarily be considered as in conflict with
other similar studies in which variant conclusions were reached. The
reader must remember that these data are based entirely on T-F
questions of certain specific types administered under particular
conditions to particular groups of students, as described.

Since the reader is already familiar with the origin and nature of
our data we shall present the results with a minimum of explanation.
Charts 12, 13 and 14 present the facts for the 100-item T-F test of
comprehension of French sentences.
Charts 12 and 13 both relate to the Number Right scores of the
French test; but in 12 the S-B “take off” is from $r_{10-10}$ while in 13 it is
from $r_{20-20}$. In 12 the S-B underestimation of the reliability of 50 T-F
items is about 0.10 in terms of k, but in 13 it is only a little over 0.06k.
If the starting point were $r_{30-30}$, the error of prediction would become
practically negligible. In Chart 14, however, the underestimation
starting from the reliability of 10’s, 20’s or 30’s is clearly about 0.10k
at 50 items, and the trend of the curves indicates a still greater under-
estimation above 50 items, although it would be incautious to accept
this indication at face value.
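
The Spearman-Brown prediction and the quantity k in which the errors of prediction are expressed can be sketched as follows. The function names are hypothetical and the reliability of .50 is an invented figure, not one taken from the charts or tables. How the article converts a difference between predicted and empirical reliabilities into units of k is not spelled out in this section, so the sketch computes the two quantities separately and goes no further.

```python
import math


def spearman_brown(r11, n):
    """Predicted reliability of a test n times as long as the component
    whose reliability is r11: n*r11 / (1 + (n - 1)*r11)."""
    return n * r11 / (1 + (n - 1) * r11)


def alienation(r):
    """k = sqrt(1 - r^2)."""
    return math.sqrt(1 - r * r)


# Hypothetical starting reliability for a 20-item component.
r_20_20 = 0.50

# Prediction for 100 items, that is, a test five times as long.
r_100_100 = spearman_brown(r_20_20, 5)

print(round(r_100_100, 3), round(alienation(r_100_100), 3))
```

With n = 1 the formula returns the starting reliability unchanged, which is a convenient check on any implementation.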

Charts 15 and 16 present the facts for the 200-item T-F examination
in the law of Real Property. The S-B predictions here are fairly
close to the empirical findings, although the underestimation of the
reliability of the R-W scores is consistent from 10 to 100 items. The
slight crossing of the lines in 15 does not conceal the close agreement
in the trends; it does afford warning, however, against making sweep-
ing generalizations on the basis of single r’s and of differences which are
well within one or two $PE_r$. Each of the last two points on the empiri-
cal curve here represents the average of four r’s; yet, in spite of this
decrease in the PE of the location of the points, we have what at first
sight would seem a significant break in trends at $r_{80-80}$; so much so
that if it had been impossible to carry the empirical verification to
$r_{100-100}$, Chart 15 might have been cited as an instance of clear over-
estimation by the S-B formula.

Charts 17-19 show the facts for the 180-item T-F examination in
Pleading and Practice. Chart 17 needs no comment, and the over-
estimation of the S-B formula in 18, where the “take off” is from $r_{10-10}$,
disappears entirely in 19, where the start is from $r_{30-30}$. The exactness
of S-B prophecies here reaches perfection, although we have a kink
in the empirical curve of 17 at $r_{40-40}$, in spite of the fact that this point
is determined by the average of six different r’s of 40 against 40. The
computations were checked and rechecked.

The facts for the 140-item T-F test in Equity, presented in Charts
20-21, should be interpreted in the light of the defects of this examina-
tion already reported. It seems clear, however, from both these charts,
that the S-B formula underestimates the reliability of R and R-W scores
above $r_{40-40}$, but conforms closely to the empirical results from $r_{10-10}$
to $r_{40-40}$. The dash-line in Chart 20 is a suggested smoothing of the
empirical curve.
These data show that in two out of four true-false tests Spearman-
Brown predictions conform closely to the empirical reliabilities, both
when the score is the Number Right and when it is R-W. There is
little suggestion of overestimation up to $r_{100-100}$; and, if we may be
permitted to make inferences from the prevailing trends of the curves
at or near $r_{100-100}$, we may say that there is little suggestion of over-
estimation by the S-B formula above 100 items. Except where the
agreement is very close, underestimation seems to be the prevailing
tendency. But these indications apply only to one form of question,
the true-false form; that we cannot generalize from the results from
one form of question is indicated by Chart 22.

Chart 22 may be taken as an illustration of the fact that different 
forms of questions behave differently with respect to rate of increase 
in reliability with increase in the number of test elements. This chart 
should be compared especially with Charts 12-14, since the tests repre- 
sented in 12, 13, 14 and 22 were parts of the same examination and all 
four charts involve results from the same 100 students. The test 
represented in 22 is a 100-item recognition (5-choice) test of French 
Vocabulary. Here the S-B formula, starting from $r_{10-10}$, overestimates
$r_{50-50}$ as much as it underestimates $r_{50-50}$ in Charts 12-14.

The variability of the r’s calculated for this study affords striking
proof of the necessity for considering carefully the PE’s of coefficients,
and of the dangers of generalizing from series of single determinations
of the reliabilities of various cumulations of components. If we had
made these charts on the basis of, say, the first determinations of
$r_{20-20}$, $r_{30-30}$, etc., but using the average $r_{10-10}$ as the S-B “take off,”
our results would have been choppy. A concrete idea of the shakiness
of our chart-lines under those conditions may be had from Table
XII, which shows the highest and lowest r’s found in each test for each
cumulation of items. It may be remarked in passing that in each case
where the number of determinations which we made for a given
cumulation is 10 or more, e.g., for $r_{10-10}$, $r_{20-20}$, $r_{30-30}$, the empirical
SD of the coefficients of correlations was found to be very nearly as
great as the $\sigma_r$ found by means of the formula $\sigma_r = \frac{1 - r^2}{\sqrt{N}}$, in which r
= the median r of the series.
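
As a small illustration, and assuming the formula meant here is the usual large-sample expression $\sigma_r = (1 - r^2)/\sqrt{N}$, the theoretical value against which the empirical SD of a set of coefficients is compared could be computed as below. The function name and the median r of .50 are hypothetical; N = 100 is used only because the French tests are said to involve 100 students.

```python
import math


def sigma_r(r, n):
    """Standard error of a correlation coefficient, (1 - r^2) / sqrt(N)."""
    return (1 - r * r) / math.sqrt(n)


# Hypothetical median r for a series of determinations, with N = 100.
example = sigma_r(0.50, 100)
print(round(example, 3))
```

The corresponding probable error would be .6745 times this value.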

The variability of the reliabilities of the larger cumulations is 
especially significant from this viewpoint, because the r’s are the 
highest and lowest of only four determinations. No analysis of the 
PE’s and sigmas is attempted in this study, our purpose being mainly 
to throw some light on the trustworthiness of the formula as it is
ordinarily and widely used in educational literature. It appears that
if the unitary r is reliable, the S-B prediction will not be greatly in
error in either direction where n in the formula $r_{nn} = \frac{n r_{11}}{1 + (n - 1) r_{11}}$
is not greater than 2 or 3.





TABLE XII.—SHOWING THE HIGHEST AND LOWEST DETERMINATIONS OF THE
RELIABILITIES OF VARIOUS CUMULATIONS OF COMPONENTS IN INDICATED
TESTS

The First Line of the Table Shows the Number of Items in the Various Cumula-
tions. “Number of r’s” Means the Number of Reliability Coefficients Calculated
between Pairs of Cumulations of Sizes Indicated in the First Line of the Table.
All the Tests Here are of the True-false Form Except French Vocabulary, Which is
of the Recognition 5-Response Form.
































No. of items cumulated       |   10 |   20 |   30 |   40 |   50 |   60 |   70 |   80 |   90 |  100

French sentences
  Number of r’s              |    8 |   10 |    3 |    4 |    4
  Rights   High              | .564 | .698 | .696 | .805 | .866
           Low               | .185 | .372 | .665 | .621 |  [?]
  R-W      High              | .539 | .624 | .621 | .712 | .861
           Low               | .141 | .384 | .562 | .696 | .727

French vocabulary
  Number of r’s              |    8 |   10 |    3 |    4 |    4
  Rights   High              | .601 | .706 | .775 | .806 | .849
           Low               | .426 | .585 | .752 | .721 | .7[?]

Pleading and practice
  Number of r’s              |   16 |  [?] |  [?] |    6 |    3 |  [?] | .... | .... |  [?]
  Rights   High              | .463 | .626 | .768 | .776 | .710 | .777 | .... | .... | .866
           Low               | .185 | .389 | .491 | .624 | .702 | .719 | .... | .... | .764
  R-W      High              | .480 | .510 | .689 | .688 | .650 | .719 | .... | .... | .794
           Low               | .189 | .374 | .434 | .510 | .630 | .680 | .... | .... | .736

Real property
  Number of r’s              |    6 |  [?] | .... |  [?] | .... | .... | .... |  [?] | .... |  [?]
  Rights   High              | .343 | .549 | .... | .678 | .... | .... | .... | .758 | .... | .777
           Low               | .088 | .342 | .... | .441 | .... | .... | .... | .624 | .... | .720
  R-W      High              |  [?] | .519 | .... | .617 | .... | .... | .... |  [?] | .... | .77[?]
           Low               | .060 | .280 | .... | .497 | .... | .... | .... | .661 | .... | .740

Equity
  Number of r’s              |   12 |   21 |    6 |    3 |    4 |    4 |    4
  Rights   High              | .294 | .559 | .493 | .510 | .700 | .714 | .740
           Low               | .011 | .112 | .281 | .375 | .540 | .603 | .523
  R-W      High              | .333 | .510 | .522 | .505 | .638 | .626 | .713
           Low               | .048 | .047 | .247 | .333 | .341 | .554 | .577
DIAGNOSIS OF FEEBLE-MINDEDNESS BY
SUBJECTIVE MEANS 


JOHN P. HERRING 
Teachers College, Columbia University 


The diagnosis of feeble-mindedness through unrecorded observation 
of behavior plus recorded generalized statements of behavior during 
individual measurement is herein studied in the case of one examiner. 
The examiner is an assistant clinical psychologist examining in State 
institutions. The subjects examined were girls in a State home; they 
are described statistically in Table I as well as verbally in the 
paragraph preceding the table. The individual examination used was 
the Porteus Maze Test, which was given rigorously according to the 
Porteus printed directions (Porteus 1919). As far as this single 20- 
minute test gave opportunity, the examiner, as she watched the 
behavior, made up her mind gradually in what degree the subject was 
feeble-minded. She, of course, used other information,¹ including the
results of a number of other tests, and she, of course, conceived 
feeble-mindedness as a scale having continuous gradation. She 
thought of feeble-mindedness as something in part quite different from 
obtained Porteus age. 

Inquiry revealed that there were definite, objective, isolable and 
identifiable acts upon which she based this diagnosis. The attempt 
was made to include in this list every relevant observation. There is 
no way of estimating precisely in what degree this end was achieved. 
There may have been elements which entered into the diagnosis but 
which for one reason or another could not subsequently be made of 
record. Judgments based upon elements of this kind are of course not 
verifiable and can form no part of a quantitative study. The conclu- 
sions of this study must be restricted to such elements as were 
included. These were listed and weighted in such a manner as to 
satisfy the following criteria: 

The list was as inclusive as possible. 

Every item finally included was judged properly included. 
Every item was assigned a reasonable weight. 

The weights were systematically and consistently distributed. 


The range of possible scores was as large as possible (0-63). 
It was possible to receive as a score any number within the range. 





1 The Binet was given more weight than the Porteus. The Witmer Formboard,
Army Designs, case histories, psychiatrical and medical findings, et cetera,
were studied.
The list of definite, isolable, and identifiable acts is illustrated for
Porteus age level 5 as follows: 


Acts Score 
TE i a Eee ses eS Og Ser AB ae 0 
Not starting at S but marking only outside the maze on second
  trial .................................................... 0
Starting at S and marking only outside the maze on second trial. 1
Starting at S and entering any of the blind alleys on second trial. 2
Starting at S and emerging from the second open alley but tracing
  one of the side lines on the way on second trial............... 3
Starting at S and emerging from the first open alley but tracing one
  of the side lines on the way on second trial.................. 4
Starting at S and emerging from the second open alley but not
  tracing any side line on second trial......................... 5
Correct solution on second trial............................... 6


ek emaandae» tirabine Os r 


The other mazes had similar treatment, except that the list of acts 
varied somewhat with the nature of the mazes. All the mazes were 
rescored according to the code illustrated. 

These statements seem true: 

1. The following verbal description says over again what is implied 
in Table I:

The group consisted of 50 persons, having an average of 10.5 
mental years, and a dispersion approximately equal to that of a 12- 
year chronological age group. The group is a normal distribution of 
mental ages, showing neither significant degree of skewing nor of 
tendency to be either flattened or peaked. 


TABLE I
n = 50            Q = 1.756            Sk = $P_{50} - \frac{1}{2}(P_{90} + P_{10})$ = −.208
M = 10.5          $P_{90}$ = 13.667    $\sigma_{Sk}$ = .480
$\sigma$ = 2.067  $P_{10}$ = 8.0       Ku = $\frac{Q}{D}$ = .315
$Q_3$ = 12.45     D = 5.667            $\sigma_{Ku}$ = .039
$Q_1$ = 8.937     Tp = .666


2. The difference between Porteus age and degree of feeble-minded-
ness as subjectively estimated, was probably too small on the average
to be detected by means of subjective observation of behavior during
an examination, even with the aid of objective, systematic scoring.
This is seen from Table II, in which the r between the Porteus method
of scoring and the method here presented is .937 and the correspond-
ing $\eta$ is .981. The sample is of rectilinear data.
TABLE II¹

r = .937      $\eta$ = .981                     $\sigma_\zeta$ = .08
k = .35       $\zeta = \eta^2 - r^2$ = .08      n = 50

1 For $\eta$ and $\zeta$ see Kelley (1923).


3. The judgment that the Porteus age and feeble-mindedness as 
estimated are different traits was not substantiated by the data. 
When one is judging two traits, the rating of individuals is not the 
only thing occurring. There is a second element of great importance 
—judgment that the two traits are not the same trait. Both the 
ratings and the judgment of non-identity of traits are properly sub- 
jected to quantitative scrutiny. The burden of proof is with any who 
make such judgments. They must at the very least compute an r. 

4. The reliability of the Porteus age as determined by the 50 cases
and by 3 other groups is represented in Table III. $r_{\frac{1}{2}\frac{\mathrm{I}}{\mathrm{II}}}$ was obtained
by correlating the sum of the odd numbered scores with the sum of the
even numbered scores. The standard deviation of the Porteus ages is
slightly larger than that of mental ages in unselected 12-year chrono-
logical age groups, but the difference favors the Porteus reli-
ability coefficients in a negligible degree. The correlations of Table
III are to be compared immediately, without correction so far as dis-
persion is concerned, with other correlations reported in age groups,¹ and





1 Pearson product-moment correlations reported in undefined groups are 
difficult to compare with coefficients reported in defined groups. To say merely 
that the r between this and that is .80 often has little meaning, but if it is added, 
e.g., that the standard deviation of the group is 10 mental months, the r at once 
becomes interpretable. It is an excellent practice to report correlations in chrono- 
logical age groups. In the case of mental age, educational age, and other com- 
parable ages, a good convention would be to report correlations either in a group 
having a standard deviation of 26 months of the kind measured, or else to report 
them in the groups available, giving the standard deviations and reporting the 
r’s that would be obtained in standard groups. A group having a standard devia-
tion of 26 months of the kind measured may be called, for the purpose of reporting
r’s, a standard group; and r reported in standard groups may be called a stand-
ard r. By extension an r estimated for a standard group by
$\frac{\sigma}{\Sigma} = \frac{\sqrt{1 - R}}{\sqrt{1 - r}}$
(Kelley, 1923) may be called a standard r, but it should always be clearly differ-
entiated from r’s not so estimated. The standard deviation of unselected 12-year-
olds in the United States, when it has been closely estimated, is likely to be a
convenient one for standard groups. My best estimate for this at present is 26
months.
especially with those of 12-year chronological age groups. They are
of course most exactly comparable with those of groups having
standard deviations equal to 2.28 mental or educational years.
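
The conversion of an observed r to a standard group, mentioned in the footnote, can be sketched by rearranging Kelley’s relation $\sigma/\Sigma = \sqrt{1 - R}/\sqrt{1 - r}$ into $R = 1 - (\sigma/\Sigma)^2(1 - r)$. The function name is hypothetical, and the figures of r = .80 and a standard deviation of 10 mental months are the footnote’s own illustrative ones, not results of this study. The relation is taken here to apply to reliability coefficients, which is an assumption about its intended scope.

```python
def standard_r(r, sd_group, sd_standard=26.0):
    """Estimate the r that would be found in a standard group.

    r           : coefficient observed in the available group
    sd_group    : standard deviation of that group (months)
    sd_standard : standard deviation of the standard group (26 months,
                  the author's best estimate for unselected 12-year-olds)
    """
    ratio = sd_group / sd_standard
    return 1 - ratio ** 2 * (1 - r)


# The footnote's illustrative case: r = .80 in a group with SD 10 months.
example = standard_r(0.80, 10.0)
print(round(example, 3))
```

When the available group already has the standard dispersion the ratio is 1 and the coefficient is returned unchanged.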

If the data presented have been properly obtained, treated and 
interpreted, and if they are representative, the Porteus Maze is 
hardly reliable enough to use by itself as basis for judgments concern-
ing individuals in state homes for boys or girls. With a reli-
ability as low as r = .81 and k = .59 it should never be averaged,
weights equal, with the Stanford Binet, for which under certain condi-
tions r = .987 and k = .19. If the Porteus is equal in reliability, say,
to the digits backward of the Stanford Binet, all taken together, then so 
far as the criterion of reliability is concerned, it might be allowed to 
affect the general result as much as the digits backward do in Stanford 
years 8 to Adult inclusive, where they are 4 tests out of 36. 


TABLE III.—PORTEUS RELIABILITY r’S AND $\sigma$’S
5. The corresponding reliability of the results of the rescore is
presented in Table IV. 
6. The standard error of estimating a true Porteus score from the
score on one test is, for these data, one Porteus year. This means that
in 68 per cent of the cases the errors of such an estimate are less than 12
months and in 32 per cent of the cases greater. $\sigma\sqrt{1 - r}$ for the
Stanford Revision of the Binet-Simon Tests is under defined con-
ditions 0.44 of a mental month (Herring, 1924).

7. Since $k_{1\mathrm{I}}$ = .59, the errors of estimating the result of a second
Porteus test from that of a first are 59 per cent as large as they would 
be by chance. 
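
The two figures in items 6 and 7 can be checked roughly as follows. The function names are hypothetical. The reliability of .81 is the one reported above; the standard deviation of 2.28 years is an assumption, borrowed from the statement that the correlations are most exactly comparable with those of groups having that dispersion, since Table III itself is not reproduced here.

```python
import math


def alienation(r):
    """k = sqrt(1 - r^2): size of the errors of estimate relative to chance."""
    return math.sqrt(1 - r * r)


def standard_error_of_measurement(sd, r):
    """sigma * sqrt(1 - r): error of one obtained score about the true score."""
    return sd * math.sqrt(1 - r)


porteus_r = 0.81          # reliability reported in the article
assumed_sd_years = 2.28   # ASSUMPTION: SD of the Porteus ages, in years

k = alienation(porteus_r)
se_years = standard_error_of_measurement(assumed_sd_years, porteus_r)

print(f"k = {k:.2f}")
print(f"standard error = {se_years:.2f} Porteus years")
```

Under these assumptions the standard error comes out close to one Porteus year and k close to .59, in line with the statements in the text.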

8. The data of this study do not answer the question whether the 
Porteus Mazes would be better scored according to Porteus’ directions 
or according to the plan here presented, which was developed merely 
for the purpose of this study. This question is in part one of 
validity. 
AN ATTENTIONAL LEARNING BOARD
WILLIAM H. ROBERTS 


The University of Redlands 
AND 


PAUL R. FARNSWORTH 
Stanford University 


INTRODUCTION 


A much neglected phase of the learning problem is that of
“ideational,” or in mechanistic terms “attentional,” learning. In
many situations of both animal and human experience, the subject
constantly shifts his attention until he finally makes the attentional
adjustment which solves the problem. At first his responses are
mainly those of “chance.” These are aided, however, by his attention
to elements which, although proved at length to be incorrect, may be at
least partially correct for some time. Suddenly the responses jump
to perfection, and maintain themselves at that point. The mentalist
would say that “the idea of the problem has been ascertained.”

Yerkes¹ reports this in the case of Julius, a four or five year old
orang utan. “Especially noteworthy, as evidences of ideation, in the 
results yielded by the multiple-choice method are (1) the use by the 
orang utan of several different methods in connection with each 
problem; (2) the suddenness of transition from method to method; (3) 
the final and perfect solution of problem 1 without diminution of the 
initial errors; (4) the dissociation of the act of turning in a circle from 
that of standing in front of a particular box.” 

In a lecture at the University of California, July 1925, W. Koehler 
mentioned a crude test of ideational learning adapted for adult humans. 
A table was divided into halves, and hammers, screw drivers, and the 
like were assembled on it. These were arranged in constantly chang- 
ing patterns. The subjects were told that with each arrangement 
there was some constant element on one-half of the table which made 
that half correct. Judgments were made and recorded and the sub- 
jects informed as to correctness. In this experiment it happened that 
correctness was associated with the shadow of an assistant. It fell on 
one side of the table, but not on the other. 


1 Yerkes, R. M.: Behavior Monographs, 3, 1916, p. 131. 
LEARNING BOARD


The writers have devised a peg board which illustrates this learning 
principle and has been found suitable for classroom and laboratory 
demonstration.¹ It is a double board with 12 holes bored similarly on
each side. These holes are fitted with “T” shaped pegs which are
colored at one end. The pegs can be freely turned to any desired 
angle and can be inserted or removed at will. Problems of almost 
unlimited complexity can be developed. 


TWO PROBLEMS


One of the simplest problems for a child is that of having one side of
the board entirely filled with red pegs and the other side with green.
The correct side may be arbitrarily designated as the side with the red
pegs. A reward in the form of a piece of candy may be offered for a
correct judgment. The colors are shifted from side to side in a random
order. The child hazards a guess for one side. He is informed of
his correctness. In a very short time the normal child will attend
to the essential element and cast all his votes correctly.


Fig. 1. Trials in groups of five.

A problem used in a laboratory course in educational psychology
was as follows. All the holes were filled with pegs and the colors were
matched as to left and right. In the center of each board was placed a
blue peg. The correct side of the board was made that to which the two
blue pegs at the centers pointed. (The pegs, except the center ones,
were shifted in a random manner.) All sorts of theories were
developed by the subjects, but gradually the correct one was reached.


1 It is amusing to note the approximation with human subjects to the
conditions that characterize animal learning.

Examples of individual and class curves are given. The class 
curve is for a group of 24. Note that its shape is very different from 
that of the individual curve, as the former merely shows the increasing 
number of subjects who have the correct solution. It is somewhat
higher at first than would occur by chance alone, as incorrect theories
may prove correct for a few times.
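
For comparison, the level that "chance alone" would produce can be sketched as follows. The sketch assumes that each judgment is an independent guess between two equally likely sides and that trials are scored in groups of five, as in the caption of Fig. 1. The class size of 24 is taken from the text. The function name is illustrative.

```python
from math import comb


def chance_distribution(trials: int = 5, p: float = 0.5) -> list[float]:
    """Binomial probabilities of 0..trials correct judgments under pure guessing."""
    return [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]


dist = chance_distribution()
expected_correct = sum(k * pk for k, pk in enumerate(dist))
p_all_correct = dist[-1]
# Number of subjects in a class of 24 expected to score a perfect group by luck.
expected_perfect_in_class = 24 * p_all_correct

print(f"expected correct per group of five: {expected_correct}")
print(f"probability of a perfect group by guessing: {p_all_correct}")
print(f"expected perfect groups in a class of 24: {expected_perfect_in_class}")
```

On these assumptions fewer than one subject in the class would show a perfect group of five by luck, so an early class curve above that level is consistent with the writers' remark that partly correct theories help for a time.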


SUMMARY 


The writers believe that the attentional, or ideational, phase of 
learning has been somewhat neglected. They offer a peg board on 
which problems of varying complexity can be presented. It is believed 
that problems and standards can be developed for different age- 
groups. The use of these should constitute an interesting test for per- 
formance in inductive reasoning. 
SOCIAL CONCEPTIONS OF CHILDREN


Children’s Social Concepts—A Study of Their Nature and Development, 
by Hymann Meltzer. New York: Teachers College, Columbia 
University, Contributions to Education, No. 192, 1925. Pp. 91. 


The major purpose of this monograph, according to the author, is
“to discover just how some cue concepts [basic concepts of contemporary
life] are developed in the minds of children.” But many of the
most interesting conclusions of the study do not lie within the field
of its major purpose. The author’s conclusion in that field is simply
that children’s concepts of “democracy,” “patriotism,” “liberalism”
and the like, are conditioned chiefly by their experience in connection
with these words; and these experiences are largely school experiences.
Nothing is said of how the concepts are developed except that they do 
not develop by themselves. Children do not get ideas to which they 
have not been exposed—if those ideas are outside their personal experi- 
ence. No one is apt to dispute the author’s conclusion in this matter. 

In a chapter on “Meaning Associated with Concepts” there are
figures which furnish an interesting commentary on present practices
in the social studies. For instance, 17 per cent of the elementary
school children examined had no idea at all connected with the word
“democracy,” and 60 per cent had totally erroneous ideas, whereas
in connection with the word “Americanism” only 7 per cent had no
idea, and less than 10 per cent had totally erroneous ideas. The
contrast between concepts of “democracy” and of “patriotism” is
even greater. Only about 5 per cent of the children reported knowing 
nothing at all about patriotism and practically none of them had 
totally erroneous ideas. 

In another interesting chapter on the “Influence of the Curriculum
on the Children’s Grasp of the Concepts” the author compares the
scores of the children who have had conventional social studies courses
(history, geography, and civics) with those of the children who have
studied a general social science course, combining history, geography,
and civics, and emphasizing contemporary civilization. The arithmetic
mean of the scores of the pupils who used the pamphlets was
124.50 ± 3.76, while that of pupils who had conventional courses was
78.650 ± 4.32. The contrast between the two courses of study is
brought out even more surprisingly in the figures which show eighth 
grade pupils, who have taken the combined course, to have a higher 
mean score than that of high school seniors who have had the conven- 
tional courses. 
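
The size of the contrast between the two means can be gauged with a short sketch. It assumes that the figures following each mean are probable errors, as was customary at the time, and that the two groups are independent, so that the probable error of the difference is the square root of the sum of the squared probable errors. Neither assumption is stated in the review, and the names are illustrative.

```python
from math import sqrt


def pe_of_difference(pe_a: float, pe_b: float) -> float:
    """Probable error of a difference between two independent means (assumed)."""
    return sqrt(pe_a**2 + pe_b**2)


# Means and (assumed) probable errors as quoted in the review.
combined_mean, combined_pe = 124.50, 3.76
conventional_mean, conventional_pe = 78.65, 4.32

difference = combined_mean - conventional_mean
pe_diff = pe_of_difference(combined_pe, conventional_pe)
ratio = difference / pe_diff

print(f"difference of means: {difference:.2f}")
print(f"probable error of the difference: {pe_diff:.2f}")
print(f"difference in units of its probable error: {ratio:.1f}")
```

Under these assumptions the difference is about eight times its probable error, well beyond the ratio of four that was then commonly taken as the mark of a reliable difference.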

The concepts used throughout the study were
selected by a careful analysis of the concepts most frequently used or
referred to in various weekly magazines over a number of years. The
children were questioned concerning these concepts by means of
personal interviews. In all, 300 children were interviewed.

Aside from the general conclusions of the monograph, the list of
concepts and the children’s ideas about them will be of real value to
teachers of social science; nor will the psychologist find the monograph
barren of suggestion.
JOHN N. WASHBURNE.





THE ROLE OF OPPOSITES IN MENTAL TESTING


The Opposites Tests, by Andrew Tenant Wylie. New York City: 
Teachers College, Columbia University, 1925. Pp. XIV + 94. 


The ‘Opposites’ as a mental test have here been subjected to 
intensive examination, experimentation and revision. Several series 
of Alphabetical Opposites, high as to validity and reliability and 
graded as to difficulty, for which tentative age and grade standards 
and an adequate Key are supplied, constitute the main contribution 
of this research. Correlations were drawn with class grades and with 
various standard measures of intellectual ability. The dissertation, 
too brief from the discussional standpoint, is embellished with no 
less than 59 tables, including 16 charted scattergrams. The reader,
searching for facts and findings, is forced to wander with difficulty
among these tables and charts and emerges at the end of the book to
find himself rewarded not by a summary of the main points of the
investigation or of the author’s conclusions, but by a rather unnecessary
one page chapter on the previously used statistical methods.
This research, while no doubt carefully planned and executed, is
disappointingly reported.
GLADYS C. SCHWESINGER.
ARITHMETIC MADE VIVID AND THRILLING


Boy’s Own Arithmetic, by Raymond Weeks. New York: E. P. 
Dutton & Co. Pp. XV + 188. 


Until some imp intimated to the reviewer that this volume might 
be a satire on certain of the current arithmetical problem books, it 
had been viewed with solemnity as a startling vivification, at least in 
content, of a subject which has proved dry as dust to many a little 
boy. The preface, reread however in the light of satire, would seem to
support this view. “This Preface is not meant for Boys, but for
Adults who are discontented with them—Adults who would like to 
hasten the time when, in reasoning powers, Boys shall be logical though 
young, and in manners, polite though innocent. As will be apparent 
in a moment, such a consummation can be brought about only by an 
entire change in the teaching of arithmetic.” And now the reviewer
teeters on the fence. 

A typical problem is the following: “An Oppossum intends to
eat 3 Persimmons every other half-hour during a night of 1114 hours. 
Being weak in Arithmetic, he makes an error, and eats 4 Persimmons 
every other 20 minutes of the time indicated. If each Persimmon 
gives him 1)4 pleasure-pounds, how much does he gain or lose from 
his deficiency in Arithmetic during the month of October?” Whatever
its intent, this much can be said definitely. The bubbling joyousness
of the author in his subject is maintained throughout the book, his
“asides” to the pupil and the pen illustrations adding much to the
interest, and to a young child with a high IQ the problems are all
possible of solution in spite of the comment in the author’s preface
that “There is another and even more radical defect in all books on
Arithmetic: they require an exact answer to every question! Could 
anything be more absurd, more unsettling, more immoral! Does 
not observation show that exact answers are the exception, rather than 
the rule, in life, and is it not for life that we train Boys?” 


F. M. Foster. 


REPRESSION AND INFERIORITY-FEAR COMPLEXES 
Childhood Fears, by G. F. Morton, M. A., B.Sc. New York: The 
MacMillan Company, 1925. Pp. 284. 


This book is largely a review and discussion of certain basic
concepts to be found in psycho-analytical literature with which general
field the author seems to be closely familiar. A large part of its
content includes quotations from the original sources, chiefly Freud,
Jung, Adler, Pfister, Rivers and Stekel. Illustrative cases are furnished
from the author’s personal experience as a school master in
England. The fundamental forces of sex, self-preservation, herd
instinct, organ-inferiority and the like are discussed as origins of
disturbance, but the author’s main thesis is that fear, especially
repressed fear, for which drive he has coined the phrase “inferiority-fear
complex,” accounts for most of the mal-adjustments of
the problem child in school. Morton advocates an extended use of
psycho-analytical technique by the teacher, who as a student of
children, and their constant associate, is better equipped to cope with
their emotional difficulties than is the medically trained man. For
this new practice the name “ped-analysis” is suggested.
GLADYS C. SCHWESINGER.


CAUSES AND TREATMENT OF JUVENILE DELINQUENCY 


The Young Delinquent, by Cyril Burt. New York: D. Appleton and 
Co., 1925. Pp. XV + 619. 


This is no doubt the best of all volumes current today which deal 
with the causes and concomitants of juvenile delinquency. Its 
author enjoys a rare combination of professional equipment—experi- 
ence for many years in direct psychological examination of children 
in the city of London, mastery of the techniques of mental and social 
measurement, and capacity for clear-cut use of the English language. 
The result is a readable book, full of reliable information for the profes- 
sional guardians of youth. 

The essentials of case study are set forth under the caption,
Problems and Methods. Then the ancestry, environment, physique, 
intellectual equipment, temperamental traits, sentiments, emotional 
attitudes, and finally the educational psychology of the juvenile 
offender, are described in the light of statistical analysis. 

Among the major conclusions are the following: (1) delinquency is
due to a great multiplicity and wide variety of alternative and convergent
influences, instead of to one, or two, or a dozen all-powerful causes;
(2) among main causes the majority are “personal,” the most significant
being, “first, the mental dullness, which is not severe enough to
be called deficiency, and secondly, the temperamental instability which
is not abnormal enough to be considered pathological. Among social
conditions by far the most potent is family life, and next to it, the
friendships formed outside the home. These four conditions are
paramount . . . Conduct and misconduct are always, in the last
analysis, the outcome of mental life”; (3) the pre-school age is of
fundamental importance for moral character and temperamental
balance; (4) treatment for delinquency already entered upon must be
individual, and mass treatment of offenders by inexpert persons is
futile.

The investigator closes his presentation of facts with the remark
that “moral perfection is no innate gift, but a hard and difficult
acquirement.” Would it not supplement the present volume admirably
to study by the same methods an equal number of children who
are as far above average in satisfactory conduct as the delinquents are
below? Putting two such studies together, we should have re-enforced
information upon which to proceed in attempts to prevent crime.


LETA S. HOLLINGWORTH.
Teachers College, Columbia University. 





THE ESSENTIALS OF STATISTICS MADE INTELLIGIBLE


Statistical Method in Educational Measurement, by Arthur S. Otis. 
Yonkers: World Book Co., 1925. Pp. XI + 337. 


In a technical subject heretofore as truly difficult for teachers as it 
is indispensable to teaching, Dr. Otis has reached a new and possibly 
sufficient degree of simple exposition. In the past the author’s genius 
has been seen creating and employing again and again the examinations 
and techniques of educational measurement. Now, in Statistical 
Method in Educational Measurement, that genius appears in a clever 
inventive clarity so discriminatingly rich in devices, verbal, numerical 
and graphical, as to succeed in placing a difficult story within the 
reach of average intellects. 

Such a product seems explicable only in terms of a process 
of selective evaluation in the author’s own art of teaching which, if 
this surmise is correct, suggests a combined economy and effectiveness 
in the methodology of textbook contrivance which deserves wider 
practice. Final choice and arrangement of content and of words in 
texts is often made at large costs of trial and revision after printing; 
whereas these ends may often be achieved in manuscript by the 
repetition and alternation of these two devices—use by a very few 
students who read to learn and note textual difficulties, and change of
each context which unduly retards comprehension. This method Dr. 
Otis’ book seems in effect to illustrate. 

Trying to put my finger in turn upon the elements of didactics so 
successful, I touch, in the order of importance (after the author’s 
noted mastery of statistics) first, an ability to rediscover the simpler 
elements needed by a beginner, but long since buried for Otis in the 
limbo of subconscious mechanisms or of bonds become vestigial through 
use; second, a penchant for the commonest words; third, enough really 
illustrative analogy; and fourth, multiple approach to difficult
problems. 

Statisticians differ from one another in their preferential habits 
for the use of histograms or of ogives. While the best authorities 
appear to me somewhat to have favored histograms, and while teachers 
generally would seem to do exceedingly well even to acquire familiar 
use of the former without mastering also (or instead) ogives and per- 
centile graphs, yet the latter being not only preferred by some, but 
also at once distinctive in their utility and introductory to portions of 
the current literature, may possibly have rightful place in an 
elementary text. 

Teachers whose purposes genuinely include the education of 
children, logically welcome such an addition to the number of educa- 
tional books that are, within the bounds of reason, both sound and 
intelligible; while the other teachers gradually find fewer alibis for
failing to know through measurement what effects they themselves as 
causes usually precede. The ascending categories of supervisory and 
administrative officials will perhaps, on the average the country over, 
welcome the book in degrees inversely proportional to the number 
of persons they command. For it seems too often true that the higher 
such officials climb, the wider their influence, and the longer since 
their professional training (it often was training); then, exceptions 
excepted, the more antiquated is their insight, and the more inhibitory 
is their effect upon youth with its newer education and its fresher plans. 

Purposes vary so often and so much that there could never be in an 
absolute sense a complete book; moreover the completeness of a book 
may not be the best thing about it. However that may be, this volume 
seems in its field successfully calculated to meet most of the earlier, 
simpler needs of teachers learning to estimate and improve their 
influence.
JOHN P. HERRING.
Teachers College, Columbia University. 
A CRITICAL STUDY OF READING METHODS


Teaching Beginners to Read in England: Its Methods, Results, and 
Psychological Bases, by W. H. Winch, M.A. Journal of Educa- 
tional Research Monographs, Number 8, B. R. Buckingham, 
Editor. Bloomington, Ill.: Public School Publishing Company,
1925. Pp. 185. 


Whatever else may be said of this sequence of studies, it should be 
brought to the attention of prospective research workers. This 
monograph is indeed so remarkable as an example of experimental 
technique that it will be read with rare delight by those who appreciate 
refinements of experimental method. Not only do the studies satisfy 
and exemplify certain essential criteria of scientific procedure but the 
report sets down the experimenter’s hypothesis, method and findings 
with exceptional clarity and shows how a complex series of problems 
was resolved into a sequence of clearly defined issues which were sub- 
jected to controlled experimental attack. It would however be unwise 
to use the monograph as a model without calling attention to certain 
crucial defects which not only reduce the significance of the conclusions 
but exemplify the factors to which the expert technician is too often 
blind. 

Here we have a meticulous comparison of certain reading methods 
still commonly used in the Infant Schools of England. Why is it that 
we cannot enthuse over the conclusions or worry lest the data lead 
American educators to revert to an English counterpart of such 
almost forgotten systems as the Ward or the Funk and Wagnalls 
Standard? If it is not scientific ineptitude which makes us skeptical, 
what is it? 

First, we cannot accept the author’s definition of beginning
reading. By statement and implication he defines reading as word-naming.
He sets up his materials, his methods and his tests in the
light of his definition but, after practically preventing any tendency
to notice phrase or sentence meaning, remarks that it was rare for a
child to realize that the words had any connected meaning and concludes
that “the actual process of learning to read put out of action
any attention pupils might have been able to give to the subject-matter.”
That the content of the lessons themselves conspired with
the method to discourage any tendency to thought-getting will appear
from the following excerpt from a “lesson”:

A man can rap on a tap. A lad can pat a cat.

A cat can nap on a lap. A lad can nap on a mat, etc.
Mr. Winch however assures us that the children found the work
interesting. His hypothesis assumes that “interest in meaning is
adult.” The comments of the children cited as evidences of the
spontaneity and interest induced by the preferred method will strike 
most persons as evidence of the pathetic docility of pupils. Those who 
have observed the joyousness of an approach which uses selections based 
on first hand experiences and have witnessed the ability of beginners 
to attend to meaning when given a chance, will be duly critical of 
instruction which consists of nothing but drill on abstractions and 
content which frustrates attention to meaning by using words
which are so extremely similar that they actually caused every child 
in the experimental group to attack each successive word of the tests 
analytically, notwithstanding the fact that 53 of the 72 words had been 
taught and that certain very common words occurred more than once 
in the tests. Thus the method actually fixed habits which hamper 
subsequent growth and must be unlearned. It sacrificed everything 
to mechanical efficiency in analyzing and naming simple words. On 
taught words the superiority of the preferred method was a difference 
of .3 of a word or 14 as great as the PE—concededly an insignificant 
difference. On untaught words the difference rose to 5.2, but the 
difference was reversed when the phonoscript group was required to 
use ordinary print. The average scores on 53 taught words in the 
test passages were 37.5 for the phonic and 37.8 for the phonoscript 
method. The per cent of accuracy in either case on taught words is 
about 70. On untaught words the phonoscript method, which uses
printed material with diacritical markings, scored 11.5 out of a possible
19 while the phonic method scored only 6.3. When the phonoscript
group was required to read the same words in changed order in another
test they were at a distinct disadvantage. The average score on
untaught words was reduced to 2.5. In subsequent tests at intervals
of six months results show that dependence on phonoscript cues 
decreases gradually. In the opinion of Mr. Winch the lack of early 
transference is offset by high accuracy of recognition attained after 18 
months of instruction. In a series of three test paragraphs, a total of 
168 words, only 7 of which had not been taught during the 18 month 
interval, the average number of misrecognitions on ordinary print was 
13.9 or about 8 per cent. No rate measure is given for this test 
and there was no test of comprehension throughout the experiment, 
but there is ample evidence that both the phonic and the phonoscript 
methods lead to very slow reading at the outset. After six months 
of instruction the average time required to “read” 72 words was 14.9
minutes by the phonic method and 17.8 minutes for the phonoscript.
The two methods then, after six months of work, result in rates of
4.8 and 4 words per minute, with an average misrecognition of 16 out
of 53 taught words, not to mention errors on new words. The most
rapid reader took 8 minutes to “read” 72 words, a rate of 9 words
per minute! We should need no graphic record of the eye movements 
of these cases to realize that the performance can hardly qualify as 
reading. 
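
The rates and percentages quoted above can be recomputed from the raw figures with a short sketch. Only numbers given in the review are used. The dictionary layout and function names are illustrative and are not taken from the monograph.

```python
def words_per_minute(words: float, minutes: float) -> float:
    """Reading rate in words per minute."""
    return words / minutes


def percent(part: float, whole: float) -> float:
    """Part expressed as a percentage of whole."""
    return 100.0 * part / whole


PASSAGE_WORDS = 72    # words in the six-month test passages
TAUGHT_WORDS = 53     # of which previously taught
UNTAUGHT_WORDS = 19   # of which new

# Average scores and times after six months, as quoted in the review.
results = {
    "phonic": {"taught": 37.5, "untaught": 6.3, "minutes": 14.9},
    "phonoscript": {"taught": 37.8, "untaught": 11.5, "minutes": 17.8},
}

for method, r in results.items():
    rate = words_per_minute(PASSAGE_WORDS, r["minutes"])
    taught = percent(r["taught"], TAUGHT_WORDS)
    untaught = percent(r["untaught"], UNTAUGHT_WORDS)
    print(f"{method}: {rate:.1f} words/min, "
          f"{taught:.0f}% of taught and {untaught:.0f}% of untaught words")

# Misrecognitions on ordinary print after 18 months: 13.9 of 168 words.
print(f"18-month error rate: {percent(13.9, 168):.1f}%")
```

The recomputed rates agree with the 4.8 and 4 words per minute stated above, and the accuracy on taught words comes out near 70 per cent for both methods.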

In a previous chapter of the monograph the phonic and “Look and 
Say” methods were compared. The average scores on a test
paragraph made up of 30 taught words were 22.2 correct recognitions 
in 6 minutes and 17 seconds for the phonic method, and 17.3 correct 
recognitions in 3 minutes and 31 seconds for the “Look and Say” 
method. Thus in no case was an average rate of 10 words per 
minute secured and the errors are not only numerous but show 
gross disregard and mutilation of meaning to be the rule. 
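
The claim that no method reached 10 words per minute can be checked from the times given. The sketch below assumes that the whole 30-word paragraph was attempted within the stated time, which the review implies but does not state outright. The correct-recognition rate is an added illustration, not a figure from the monograph.

```python
def per_minute(count: float, minutes: int, seconds: int) -> float:
    """Rate per minute for a count achieved in minutes:seconds."""
    return count / (minutes + seconds / 60.0)


PARAGRAPH_WORDS = 30  # taught words in the test paragraph

# (correct recognitions, minutes, seconds) as quoted in the review.
methods = {
    "phonic": (22.2, 6, 17),
    "Look and Say": (17.3, 3, 31),
}

rates = {}
for name, (correct, minutes, seconds) in methods.items():
    attempted = per_minute(PARAGRAPH_WORDS, minutes, seconds)
    recognized = per_minute(correct, minutes, seconds)
    rates[name] = (attempted, recognized)
    print(f"{name}: {attempted:.1f} words/min attempted, "
          f"{recognized:.1f} correct recognitions/min")
```

On this reading both methods fall below 10 words per minute, in agreement with the reviewer's statement.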

There are also data to show that none of the methods can be
depended on to teach spelling. 

While it is hard to find strictly comparable data on the outcomes of
instruction in reading in American schools the study gives us new 
confidence in the broader aims of primary reading in progressive 
American schools. It also shows how meagre the returns are when a 
rigidly systematic approach is used with children of kindergarten age. 
It provides in addition a basis for interpreting the reports of American 
visitors in English primary schools and the means of making objective 
comparisons in the near future. From such comparisons only the 
least progressive methods have any reason to shrink. 

LAURA ZIRBES. 
Teachers College, Columbia University. 





A NEW APPROACH TO ABNORMAL PSYCHOLOGY


An Introduction to Objective Psychopathology, by G. V. Hamilton, with 
a foreword by Robert M. Yerkes. St. Louis: The C. V. Mosby 
Co., 1925. Pp. 354. 


Educational psychologists will be glad to find at last a medical 
examiner of mentally aberrant persons, who can relate his findings to 
what is known of normal psychology. Dr. Hamilton has taken the
pains not only to study normal habit formation in human beings, but
also to observe experimentally the same phenomena in infra-human
animals, in rodents, dogs, monkeys and baboons. He knows at first
hand that these creatures can learn to be either “well-adjusted” or
“eccentric,” according to the way in which the experimental situation
may be engineered. 

Save for the mental disorders which follow from organic disease of 
tissues constituting the nervous system and its closely associated 
mechanisms, the “symptoms” of insanity and nervousness are recognized
as habits, acquired according to the established laws of learning.
The descriptive adjective “insane” means merely that the habit is 
valueless or positively harmful for the biological or social survival of 
the organism. People learn to be insane, eccentric, disagreeable, just 
as they learn to talk, to be patriotic, or to conform to customary 
manners. 

Years ago Thorndike showed that a monkey will learn to run from 
food when hungry, instead of toward it, if only the situation be manipu- 
lated by the experimentalist so that satisfaction always follows the 
retreat. Even so, a human being will learn to shrink from good health, 
and to seek “the status of a physically impaired person,” if only life
be so arranged that satisfactions or relief from annoyances may be 
achieved by invalidism. In both cases the laws of learning are the 
same; the end result is a valueless or harmful habit; and the principles
of “cure” are simply the principles of re-education.

The laws of learning alone, however, do not account for the fact 
that some persons learn to be insane, while others do not, in similar 
“baffling situations.” Dr. Hamilton defines and faces this problem, 
but does not develop it as far as it is, perhaps, capable of development 
in the light of available knowledge in the field of individual differences. 
The selected bibliography upon which his discussion is founded 
scarcely shows the familiarity with research in the field of differential 
psychology which we might have expected from citation of such 
names as Yerkes, Thorndike, and Binet. 

The citation of these investigators and of others who have built up 
our knowledge of animal psychology, including the human in its 
normal aspects, is practically unprecedented in works by medical 
psychiatrists. Perhaps the publication of this book means the 
beginning of a time when psychiatry will be based upon the study of 
the normal mind as a foundation, instead of upon the present curric- 
ulum of the medical school, which is practically irrelevant to an 
understanding of the mental disorders which are not due to organic 
disease of nervous tissue.
LETA S. HOLLINGWORTH.
A REFERENCE BOOK ON READING


Summary of Investigations Relating to Reading, by William Scott Gray, 
Supplementary Educational Monographs, Number 28. Chicago: 
The University of Chicago, 1925. Pp. 275. 


Notwithstanding the exceeding fertility of research in the field of 
reading during the last quarter century conclusive evidence on certain 
moot questions is still lacking. Research workers in that field have 
been rendered a signal service in a recent monograph, made possible 
by a subvention from the Commonwealth Fund and the careful 
critical work of Dr. William Scott Gray. An annotated bibliography of 
436 studies shows how comprehensive the summary of investigations 
is. A glance at 17 chapter headings shows how the studies have 
been grouped and analyzed for their conclusions. Thus the student, 
investigator or critical worker in the field may quickly locate the discus- 
sion of any topic with which he is concerned and come into immediate 
contact with the conclusions of those who have studied the problem 
experimentally from various angles and by various means. When the 
conclusions are contradictory or inconclusive that fact is not stated 
and interpreted, but studies which would provide adequate data are 
proposed. By a critical attack on half solved problems the necessity
for dependence on opinion could be rapidly reduced and controversial 
issues would be settled upon a scientific basis. Waste and duplication 
of effort should thus be reduced and problems of prime significance 
should receive early attention. Meanwhile this monograph and the 
annual supplements which summarize subsequent studies confirm- 
ing present findings or reporting bases for altered conclusions, should 
prove as stimulating to research as Part I of the Twenty-fourth 
Yearbook¹ has proved to elementary school practice.


LAURA ZIRBES.
Teachers College, Columbia University. 





1 “Twenty-fourth Yearbook of the National Society for the Study of Educa- 
tion, Part I.” Bloomington, Ill., Public School Publishing Co., 1925. 